Automatic Data Seeding using CSV and Custom django-admin Command
A step-by-step guide to using a custom django-admin command to automate seeding data into our database
Why? Do we really need it?
In modern web development, database management often poses significant challenges. One of the most common is the need to establish initial data in a database, especially on a new instance or during a schema migration. With complex relationships between tables, we don't want to insert that data by hand, do we? The task is arduous: a database can contain millions of records, along with relations between tables that we did not even realize existed.
To get around this problem, developers use automatic data seeding. The technique is particularly relevant when developing, testing, or setting up a new database. It lets developers insert sample data quickly, which allows for thorough testing and early detection of potential issues. Automatic data seeding also provides consistency across development, testing, and production environments, because all team members work with identical data. This consistency reduces the risk of human error and streamlines the development process. It also saves time by eliminating manual data entry. In this article, we will create a custom django-admin command to automate data seeding in a project.
Introduction to django-admin
Django equips developers with a command-line utility called django-admin for managing administrative tasks, including data seeding. It handles tasks such as creating Django projects or apps, running tests, and migrating databases, and it can be invoked as django-admin (django-admin.py in older Django versions) or through the classic manage.py script in the project root. A nice thing about it is that we can also add our own commands. Let's start by familiarizing ourselves with the django-admin CLI. The most widely used commands are startproject, runserver, and collectstatic. All other commands can be listed with the command below:
python manage.py help
Configure new app for data seeding
To set up a custom django-admin command, let's first initialize a new app named data_seed in the Django project:
python manage.py startapp data_seed
Add the new app to INSTALLED_APPS in project/settings.py:
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'corsheaders',
    'rest_framework',
    'data_seed',  # <-- our new data_seed app
]
Create a model named DummyShip in data_seed/models.py:
from django.db import models

# Create your models here.
class DummyShip(models.Model):
    ship_name = models.CharField(max_length=50)
    ship_total_vessels = models.IntegerField()

    def __str__(self):
        return f'{self.ship_name}'
Now that we have a new model, create a database migration for it:
python3 manage.py makemigrations
Then apply the migration to the database:
python3 manage.py migrate
With the migration applied to our database, let's register DummyShip on the admin site for convenience, so we can access it through the web interface. In data_seed/admin.py, write:
from django.contrib import admin
from .models import DummyShip
admin.site.register(DummyShip)
Let's check that our model is ready to be used. Run the Django development server:
python3 manage.py runserver
Open http://127.0.0.1:<port>/admin (the <port> depends on your local setup; runserver uses 8000 by default) and make sure the DummyShip model is listed.
django-admin custom command
Now that our app is ready, let's create the custom Django management command:
- Create the folder management/commands inside the data_seed app.
- Create a file named after the command we want to register. Since we want to import data, name it import_data.py and write the following code in it:
from csv import reader

from django.core.management.base import BaseCommand

from data_seed.models import DummyShip


class Command(BaseCommand):
    help = "Seed data from CSV files"

    def add_arguments(self, parser):
        parser.add_argument('file_path', type=str, help='Path to the CSV file')

    def handle(self, *args, **kwargs):
        file_path = kwargs['file_path']
        try:
            with open(file_path, 'r', newline='') as csvfile:
                csv_reader = reader(csvfile, delimiter=';')
                next(csv_reader)  # skip header row
                # CSV columns: ID;ship_name;ship_total_vessels (the ID column is ignored)
                dummies = [
                    DummyShip(ship_name=row[1], ship_total_vessels=int(row[2]))
                    for row in csv_reader
                ]
            DummyShip.objects.bulk_create(dummies)
            self.stdout.write(self.style.SUCCESS('Data imported successfully'))
        except Exception as exc:
            self.stdout.write(self.style.ERROR(f'Error importing data: {exc}'))
This code lets us import data into the DummyShip model from a CSV file. With this structure, our custom command is invoked as:
python3 manage.py import_data <path/to/file/csv>
This is a custom Django management command that imports data from a CSV file into the application's database. Let's break down the code:
- class Command(BaseCommand): the Command class inherits from Django's BaseCommand class, which is the base class for all Django management commands.
- help = "Seed data from CSV files": a short description of what the command does. It is displayed when you run python manage.py help import_data.
- def add_arguments(self, parser): this method specifies the command-line arguments for the management command.
- parser.add_argument('file_path', type=str, help='Path to the CSV file'): adds a required argument file_path that is expected to be a string. This is the path to the CSV file that contains the data to be imported.
- def handle(self, *args, **kwargs): this method holds the main logic of the management command. It is called when you run the command.
- file_path = kwargs['file_path']: retrieves the file_path argument that was passed on the command line.
- with open(file_path, 'r', newline='') as csvfile: opens the CSV file. Passing newline='' is what Python's csv module recommends for files handed to a reader.
- csv_reader = reader(csvfile, delimiter=';'): creates a CSV reader object that lets us iterate over the rows in the file. The delimiter is specified as a semicolon.
- next(csv_reader): skips the header row of the CSV file.
- dummies = [DummyShip(ship_name=row[1], ship_total_vessels=int(row[2])) for row in csv_reader]: builds a list of DummyShip objects, one per row. Each row is a list of strings in column order (ID, ship_name, ship_total_vessels), so row[1] supplies the name and int(row[2]) the vessel count. The ID column is ignored and the database assigns the primary key. Creating an empty DummyShip() for each row instead would discard the CSV data, and the save would fail because ship_total_vessels cannot be null.
- DummyShip.objects.bulk_create(dummies): saves all of the DummyShip objects to the database, typically in a single query, which is more efficient than saving each object individually.
- self.stdout.write(self.style.SUCCESS('Data imported successfully')): if everything went well, this prints a success message to the console.
- except Exception as exc: if anything goes wrong during the import (for example, the file can't be opened, a value can't be converted, or the objects can't be saved), the code in the except block is executed.
- self.stdout.write(self.style.ERROR(f'Error importing data: {exc}')): prints an error message to the console, including the reason for the failure.
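The row-parsing step can also be keyed on column names instead of positions, which keeps the import working if the column order in the CSV changes. The following is a minimal sketch using only the standard library, not the article's command itself. The helper name parse_ship_rows is hypothetical, and it yields plain dictionaries that could be passed to the model as DummyShip(**row).

```python
import csv
import io


def parse_ship_rows(csvfile, delimiter=';'):
    """Yield one dict per CSV row, keyed by model field name (hypothetical helper)."""
    # DictReader consumes the header row itself, so no explicit next() is needed.
    # Data rows start on line 2 of the file, hence start=2.
    for line_no, row in enumerate(csv.DictReader(csvfile, delimiter=delimiter), start=2):
        try:
            yield {
                'ship_name': row['ship_name'].strip(),
                'ship_total_vessels': int(row['ship_total_vessels']),
            }
        except (KeyError, ValueError, TypeError, AttributeError) as exc:
            # Report the offending line instead of failing with a bare error.
            raise ValueError(f'Invalid data on line {line_no}: {exc}') from exc


# Usage with an in-memory file in the same layout as the article's sample CSV.
sample = io.StringIO('ID;ship_name;ship_total_vessels\n0;jayakarta I;15\n')
rows = list(parse_ship_rows(sample))
print(rows)
```

Because the helper raises a ValueError that names the line, a malformed row can be traced back to the file without guessing.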
Seed the data!
Now that all the necessary code is in place, let's create a dummy CSV file to test our custom command. Create importfile.csv in the project root folder and write:
ID;ship_name;ship_total_vessels
0;jayakarta I;15
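For larger test sets, a file in this layout does not have to be typed by hand. The sketch below generates one with Python's csv module; the helper name write_sample_csv and the generated ship names and vessel counts are made-up placeholders, not data from a real source.

```python
import csv
import io


def write_sample_csv(csvfile, count):
    """Write a header plus `count` placeholder ships in the semicolon-separated layout."""
    writer = csv.writer(csvfile, delimiter=';', lineterminator='\n')
    writer.writerow(['ID', 'ship_name', 'ship_total_vessels'])
    for i in range(count):
        # Names and vessel counts are placeholder values for testing only.
        writer.writerow([i, f'dummy ship {i}', 10 + i])


# Written to an in-memory buffer here; to create a real file, pass
# open('importfile.csv', 'w', newline='') instead.
buffer = io.StringIO()
write_sample_csv(buffer, 3)
print(buffer.getvalue())
```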
Run our custom command:
python3 manage.py import_data importfile.csv
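The parser that Django passes to add_arguments is built on Python's argparse, so the way this invocation is parsed can be sketched with the standard library alone. This is an illustration under that assumption, not Django's own code. The --delimiter option is hypothetical and is not defined in the command above.

```python
import argparse

# Plain argparse stand-in for the parser Django hands to add_arguments.
parser = argparse.ArgumentParser(prog='manage.py import_data')
parser.add_argument('file_path', type=str, help='Path to the CSV file')
# Hypothetical optional flag, not part of the article's command.
parser.add_argument('--delimiter', default=';', help='Field separator used in the CSV file')

# Equivalent of: python3 manage.py import_data importfile.csv
options = parser.parse_args(['importfile.csv'])
print(options.file_path, options.delimiter)
```

In a real command, the parsed values arrive in handle as keyword arguments, which is why the code reads kwargs['file_path'].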
If everything is set up correctly, the terminal shows the success message Data imported successfully.
Next, let's check the admin page to make sure the data was imported.
Click on jayakarta I and we will see that the field values match our CSV file.
We now have a custom command that automates data seeding, so we no longer have to insert the data manually.
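For very large files, building one list of all rows before saving can use a lot of memory. One option is to process the rows in fixed-size batches. The sketch below shows only the batching step, using the standard library; the helper name batched_rows and the batch size are assumptions. In the command, each batch could be turned into DummyShip objects and handed to bulk_create, which also accepts a batch_size argument of its own.

```python
from itertools import islice


def batched_rows(rows, size):
    """Yield lists of at most `size` items from any iterable (hypothetical helper)."""
    iterator = iter(rows)
    while True:
        # islice pulls the next `size` items without loading the whole input.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# Usage: five rows split into batches of two.
for batch in batched_rows(range(5), 2):
    print(batch)
```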
Conclusion
Automation in data seeding and migration is truly a boon in the landscape of web development. It not only enhances efficiency but also ensures consistency and reliability in the development process. Using tools like Django's django-admin command-line utility and its support for custom commands, developers can save valuable time, reduce errors, and streamline the entire process of database management.
Whether it’s for initializing test databases, migrating large amounts of data, or setting up production databases, automated data seeding can significantly simplify these tasks. By leveraging Django’s powerful and flexible features, developers can focus more on building robust and efficient applications, knowing that their data management needs are being handled effectively and consistently.
In summary, automatic data seeding is not just a good-to-have feature but a crucial aspect of modern web development practices. It’s a testament to the constant evolution and progress of development tools and methodologies, aimed at making the life of developers easier and their work more efficient. With Django at the helm, automatic data seeding and migration becomes a smoother, more manageable task, enabling you to deliver better quality applications in less time.