Dec. 3, 2024
At work, the time came to set up continuous integration (CI) for projects slightly larger than a regular website. I usually keep my code in Bitbucket, so I decided not to look far for CI tools: instead of learning TravisCI or CircleCI, I chose Bitbucket Pipelines. Unfortunately, right at the start, the migrations themselves failed with this error:
django.db.utils.OperationalError: (1366, "Incorrect string value: '\\xC5\\xBCytko...' for column 'name' at row 1")
It was clear that the problem was the Polish project and its Polish diacritics (the 'user' field). By default, the database is created with the latin1_swedish_ci collation. I struggled for a long time looking for information on creating a database with the correct utf8 character set. Of course, I started by looking for a way to set it by default, something more than 'CHARSET': 'UTF8' in the database options, because that did nothing: the database was still created with the Swedish collation:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'test_pipelines',
        'USER': 'test_user',
        'PASSWORD': 'test_user_password',
        'HOST': '127.0.0.1',
        'PORT': '3306',
        'OPTIONS': {
            "init_command": "SET default_storage_engine=MyISAM",
        },
        'TEST': {
            'NAME': 'test_pipelines',
            'CHARSET': 'UTF8',
        },
    }
}
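To see why the error mentions \xC5\xBC, here is a minimal sketch. It assumes the rejected value starts with the letter "ż" (U+017C), which is what those two bytes decode to in UTF-8. Python's latin-1 codec is used only as a stand-in for MySQL's latin1 character set; the two are not identical, but neither contains "ż". The helper name fits_in_latin1 is made up for this illustration.

```python
# "z with dot above" (U+017C), written as an escape so the file stays ASCII.
letter = u"\u017c"

# UTF-8 encodes this letter as two bytes: the ones MySQL printed as \xC5\xBC.
utf8_bytes = letter.encode("utf-8")
print(repr(utf8_bytes))


def fits_in_latin1(text):
    """Return True if every character of text can be encoded as latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


# The Polish letter cannot be stored in a latin-1 column, plain ASCII can.
print(fits_in_latin1(letter))
print(fits_in_latin1(u"zytko"))
```

This is why changing the character set of the database, and not the Django settings alone, is what makes the error go away.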
I did not want to believe that there was nothing more than the advice from https://confluence.atlassian.com/bitbucket/test-with-databases-in-bitbucket-pipelines-856697462.html:
definitions:
  services:
    mysql:
      image: mysql
      environment:
        MYSQL_DATABASE: 'pipelines'
        MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
        MYSQL_USER: 'test_user'
        MYSQL_PASSWORD: 'test_user_password'
I finally found two possible solutions. The first is to run mysqld with the parameters below, following the instructions at https://kierenpitts.com/blog/2017/05/testing-django-applications-with-bitbucket-pipelines-and-mysql/:
definitions:
  services:
    mysql:
      image: mysql
      command: mysqld --character-set-server=utf8 --collation-server=utf8_polish_ci --default-storage-engine=MyISAM
      environment:
        MYSQL_DATABASE: 'test_pipelines'
        MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
        MYSQL_USER: 'test_user'
        MYSQL_PASSWORD: 'test_user_password'
Unfortunately, this solution did not work for me. Nothing in the build output showed the database being created with that encoding.
That left the second solution: create the database first and then use a script to set the appropriate encoding, as described at https://josefottosson.se/change-collation-to-utf-8-on-all-tables-with-django-mysql/. The script must run before the migrations. So I wrote a script that changes the database to utf8_polish_ci:
import os
import sys

from django.db import connection
from project import settings

sys.path.append(settings.BASE_DIR)
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "project.settings_pipelines")


def init():
    cursor = connection.cursor()
    tables = connection.introspection.table_names()
    # Change the default character set and collation of the database itself.
    sql = "ALTER DATABASE pipelines CHARACTER SET utf8 COLLATE utf8_polish_ci;"
    cursor.execute(sql)
    # Convert every table that already exists.
    for table in tables:
        # Single-argument print() behaves the same on Python 2.7 and Python 3.
        print("Fixing table: %s" % table)
        sql = "ALTER TABLE %s CONVERT TO CHARACTER SET utf8;" % table
        cursor.execute(sql)
        print("Table %s set to utf8" % table)
    print("DONE!")


init()
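The statements the script sends can also be built by a small pure function, which makes them easy to inspect before anything touches the database. The sketch below is a variation and not the script I actually ran: the function name build_conversion_sql and the example table names are made up, identifiers are wrapped in backticks, and the collation is passed to ALTER TABLE as well (the script above only sets the character set there).

```python
def build_conversion_sql(db_name, tables, charset="utf8",
                         collation="utf8_polish_ci"):
    """Return the ALTER statements as a list of strings, database first."""

    def quote(identifier):
        # Backtick-quote a MySQL identifier, doubling any embedded backtick.
        return "`%s`" % identifier.replace("`", "``")

    statements = [
        "ALTER DATABASE %s CHARACTER SET %s COLLATE %s;"
        % (quote(db_name), charset, collation)
    ]
    for table in tables:
        statements.append(
            "ALTER TABLE %s CONVERT TO CHARACTER SET %s COLLATE %s;"
            % (quote(table), charset, collation)
        )
    return statements


# Hypothetical table names, only to show the generated SQL.
for statement in build_conversion_sql("pipelines", ["auth_user", "order"]):
    print(statement)
```

Each returned string could then be passed to cursor.execute() in the same way as in the script.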
Ultimately, my bitbucket-pipelines.yml looks something like this:
image: python:2.7
pipelines:
  default:
    - step:
        caches:
          - pip
        services:
          - mysql
        script: # Modify the commands below to build your repository.
          - pip install --upgrade pip
          - pip install six
          - pip install -r requirements.txt
          - export DJANGO_SETTINGS_MODULE=project.settings_pipelines
          - python pipeline_database_conversion.py
          - python manage.py migrate --settings=project.settings_pipelines
          - python manage.py migrate --database=historical --settings=project.settings_pipelines
definitions:
  services:
    mysql:
      image: mysql
      environment:
        MYSQL_DATABASE: 'pipelines'
        MYSQL_RANDOM_ROOT_PASSWORD: 'yes'
        MYSQL_USER: 'test_user'
        MYSQL_PASSWORD: 'test_user_password'
        MYSQL_DEFAULT_CHARACTER_SET: 'utf8'
Please do not be surprised by the two migrate commands: this project uses two different databases on dev and production. For testing purposes, however, I created only one database that contains all the tables. In the end, the pytest tests run nicely.
As for the script that changes the encoding of the database, most people will probably only need the part that does not iterate through the tables:
def init():
    cursor = connection.cursor()
    # Only the database default is changed; tables created later inherit it.
    sql = "ALTER DATABASE pipelines CHARACTER SET utf8 COLLATE utf8_polish_ci;"
    cursor.execute(sql)
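To check that the ALTER DATABASE statement really took effect, the default character set and collation can be read back from MySQL's information_schema.SCHEMATA table. The sketch below was not part of my pipeline: database_charset and StubCursor are made-up names, and the stub only stands in for a real cursor so the example runs without a MySQL server.

```python
CHARSET_QUERY = (
    "SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
    "FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = %s"
)


def database_charset(cursor, db_name):
    """Return (charset, collation) for db_name, or None if it is unknown."""
    cursor.execute(CHARSET_QUERY, [db_name])
    row = cursor.fetchone()
    return tuple(row) if row is not None else None


class StubCursor(object):
    """Stand-in for a DB-API cursor; it returns one preset row."""

    def __init__(self, row):
        self.row = row
        self.executed = []

    def execute(self, sql, params=None):
        # Record the call instead of talking to a database.
        self.executed.append((sql, params))

    def fetchone(self):
        return self.row


# With the stub, the helper simply hands back the preset row.
stub = StubCursor(("latin1", "latin1_swedish_ci"))
print(database_charset(stub, "pipelines"))
```

In the pipeline, the real cursor from django.db.connection.cursor() would take the place of the stub. Newer MySQL versions may report utf8 under the name utf8mb3, so compare the result with that in mind.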
It took me a long time to get everything working, so I decided to write it down. I hope this helps other Django / Python developers, especially Polish ones, integrate Bitbucket Pipelines more easily.