In the previous article, the Introduction, I gave an extensive breakdown of all the concepts and modules employed in this project. So it's pertinent that you check it out to get a full hang of what we will be doing in this episode. Also, be sure to adhere to the disclaimer.
With that said, I believe your sleeves are all rolled up again. So let's dive in.
We will write a base class to define a few common attributes, methods, and behaviour for other classes. Every instance of each class we write for this project must possess some common attributes: an id that is universally unique (I explained the reason and implementation of a universally unique identity here), and two dynamically created instance attributes, created_at and updated_at, assigned the current date and time when an instance is created or modified.
All the attributes we will define in this base class must be public instance attributes so that they will be accessible by all subclasses of our base class.
from datetime import datetime
from uuid import uuid4


class MyBase:
    def __init__(self):
        # assign a few common default attributes to every instance of this class
        self.id = str(uuid4())
        self.created_at = datetime.now()
        self.updated_at = datetime.now()
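To illustrate why these attributes are public, here is a minimal sketch with a subclass. The name User is my own assumption for illustration and is not part of the project at this point. It only shows that a subclass inherits the constructor and therefore receives the same default attributes.

```python
from datetime import datetime
from uuid import uuid4


class MyBase:
    def __init__(self):
        self.id = str(uuid4())
        self.created_at = datetime.now()
        self.updated_at = datetime.now()


# User is a hypothetical subclass, used only for illustration
class User(MyBase):
    pass


user = User()
other = User()

# the subclass inherits the constructor, so every instance gets the defaults
print(type(user.id))                          # <class 'str'>
print(len(user.id))                           # 36
print(user.id != other.id)                    # True
print(isinstance(user.created_at, datetime))  # True
```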
We will also define a method, save_update(), to update the updated_at attribute with the current date and time whenever we make changes to an instance of our class. This will be useful in tracking when changes are made to an instance.
from datetime import datetime
from uuid import uuid4


class MyBase:
    def __init__(self):
        # assign a few common default attributes to every instance of this class
        self.id = str(uuid4())
        self.created_at = datetime.now()
        self.updated_at = datetime.now()

    def save_update(self):
        # update the updated_at attribute with the current date and time
        # (self. is required; a bare name would only create a local variable)
        self.updated_at = datetime.now()
At first, before I wrote the save_update() method, both created_at and updated_at had essentially the same date and time stamp. Without a method to refresh the updated_at attribute, it would keep the date and time from when the instance was first created, no matter how often we update or modify the instance.
Imagine creating an instance two days ago and updating one of its attributes today. The updated_at attribute, which is supposed to give you the date and time of the update, would still give you the date and time of two days ago when the instance was initially created. That's not proper, right?
So by defining a save_update() method and calling it each time we modify or update an instance of our class, we can keep track of the date and time at which every instance was updated.
I will demonstrate this below by creating an instance, updating it by assigning a name attribute to it, and then calling the save_update() method, which will update our updated_at attribute to correspond with the exact date and time at which that instance was modified.
# instantiation (creating an object from our class)
my_model = MyBase()
# update the object and call save_update() method to get the current time
my_model.name = "Julien Kimba"
my_model.save_update()
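To make the effect visible, here is a small sketch that compares the timestamps before and after the call. The short pause and the variable name before are my own additions for illustration only.

```python
import time
from datetime import datetime
from uuid import uuid4


class MyBase:
    def __init__(self):
        self.id = str(uuid4())
        self.created_at = datetime.now()
        self.updated_at = datetime.now()

    def save_update(self):
        # the self. prefix matters: without it, the assignment would only
        # create a local variable and the instance would stay unchanged
        self.updated_at = datetime.now()


my_model = MyBase()
before = my_model.updated_at

time.sleep(0.05)  # pause only so the difference is easy to see
my_model.name = "Julien Kimba"
my_model.save_update()

print(my_model.updated_at > before)               # True
print(my_model.updated_at > my_model.created_at)  # True
```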
Let's also define another public instance method, save_to_dict(), that returns a dictionary containing all key-value pairs of __dict__, which is where the instance attributes of the class are stored.
To access this dictionary or mapping object (__dict__) and return its contents as a dictionary, we will use the self.__dict__ syntax, which ensures that we return only the instance attributes set on the specific class instance we are working with.
We can consider the save_to_dict() method our first form of serialization/deserialization, as it returns a representation of the object's state in a format (a dictionary) that can be easily stored. We can also reconstruct (deserialize) the original object by passing this dictionary representation into the class constructor.
I guess you quickly thought of json.dumps() when I mentioned "serialization". It's worth knowing that JSON serialization is a specific type of serialization where the object is converted into a JSON-formatted string, whereas we are using our save_to_dict() method to serialize our class instances to a Python dictionary. Let me highlight a few differences between them:
The output of json.dumps() is a string in a standardized format that can be easily saved to a file or sent over a network, which we will be doing later in this project, whereas the output of save_to_dict() is a Python dictionary that we can easily manipulate and use within our Python program.
json.dumps() is typically used when you need to transfer or store data in a standardized format that can be easily consumed by other programs, while our save_to_dict() method is useful when we need to manipulate the data within our Python program, which is our class instance attributes in this case.
Another difference is that json.dumps() can only serialize certain types of Python objects that are supported by the JSON format, whereas our save_to_dict() method can be implemented to serialize any type of Python object.
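The last difference is easy to see in a short sketch. The record dictionary below is a made-up example of mine, not part of the project code. It shows that a dictionary can hold a datetime object, while json.dumps() refuses it until we convert it to a string.

```python
import json
from datetime import datetime

# a plain dictionary can hold any Python object, including a datetime
record = {"name": "Julien Kimba", "created_at": datetime(2023, 5, 7, 14, 7, 26)}
print(type(record["created_at"]).__name__)  # datetime

# json.dumps() rejects the datetime value as it is
try:
    json.dumps(record)
except TypeError as err:
    print("TypeError:", err)

# once the datetime is converted to an ISO string, json.dumps() works
record["created_at"] = record["created_at"].isoformat()
as_json = json.dumps(record)
print(type(as_json).__name__)         # str
print(json.loads(as_json) == record)  # True
```

This is why converting the datetime values to ISO strings is a sensible step before any JSON work.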
With that said, before we implement the save_to_dict() method, I want to point out that we also need to add a __class__ key with the class name of the instance as its value. This is used to identify the class that the instance attributes came from during the serialization process.
Let's implement this!
def save_to_dict(self):
    dictFormat = {}
    # add a class key to identify the class name of the instance attribute
    dictFormat["__class__"] = self.__class__.__name__
dictFormat is an empty dictionary that will hold the dictionary representation of our class instance attributes after the serialization, and the assignment dictFormat["__class__"] = self.__class__.__name__ is how a key-value pair is added to a dictionary. So at this point, dictFormat will contain something like this: {'__class__': 'MyBase'}.
Now we convert our public instance attributes created_at and updated_at to string objects in ISO format. Remember, they are datetime objects (I explained this in the previous article).
def save_to_dict(self):
    dictFormat = {}
    dictFormat["__class__"] = self.__class__.__name__
    for key, val in self.__dict__.items():
        # get the values which are of datetime object type
        if isinstance(val, datetime):
            # convert them to string objects in ISO format
            dictFormat[key] = val.isoformat()
        else:
            dictFormat[key] = val
    return dictFormat
self.__dict__ stores the instance attributes, as I mentioned before, so we iterated over it to find the attributes whose values are datetime objects, and you know that only the created_at and updated_at attributes hold datetime objects. I then converted those values (date and time stamps) to strings in ISO format, and under the else clause, I left the rest of the attributes in their original form.
Another way to do this is to check for the created_at and updated_at keys by name while looping through the __dict__ object of your instance, as we did above, and convert their values to strings in ISO format using any of the datetime methods I mentioned previously.
You should figure out this alternative implementation yourself. That's learning!
It seems we forgot one important thing: a human-readable string representation of these objects. If we don't define a string representation method, printing the objects we've implemented thus far gives a default string with the class name and the memory address of the object, something like this: <__main__.MyBase object at 0x7f4a16aabdc0>, which is not really useful to us.
Now let's implement a string representation that prints the class name of our instance, the instance ID, and all the attributes of that instance in this format: [<class name>] (<self.id>) <self.__dict__>
def save_to_dict(self):
    dictFormat = {}
    dictFormat["__class__"] = self.__class__.__name__
    for key, val in self.__dict__.items():
        if isinstance(val, datetime):
            dictFormat[key] = val.isoformat()
        else:
            dictFormat[key] = val
    return dictFormat

def __str__(self):
    # str representation format for all class instance and its attributes
    return f"{[self.__class__.__name__]} {(self.id)} {self.__dict__}"
I've just included a __str__() method that returns the human-readable string representation of our objects, formatted with an f-string.
Let's put all we've done so far in one block, create a class instance, update it with a few attributes, and see what the output looks like.
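A small aside on how the f-string behaves. Inside the braces, [self.__class__.__name__] is evaluated as a one-element list, so it prints with quotes, and (self.id) is just a parenthesized expression, so no parentheses appear. The sketch below uses two plain variables of my own (name and obj_id, both placeholders) to show the difference between putting the brackets inside and outside the braces.

```python
name = "MyBase"
obj_id = "1234"  # placeholder value for illustration

# brackets and parentheses inside the braces are part of the expression
inside = f"{[name]} {(obj_id)}"

# brackets and parentheses outside the braces are literal text
outside = f"[{name}] ({obj_id})"

print(inside)   # ['MyBase'] 1234
print(outside)  # [MyBase] (1234)
```

Either form works as a string representation. Which one you pick depends on the exact output format you want.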
from datetime import datetime
from uuid import uuid4


class MyBase:
    def __init__(self):
        # assign a few common default attributes to every instance of this class
        self.id = str(uuid4())
        self.created_at = datetime.now()
        self.updated_at = datetime.now()

    def save_update(self):
        # update the updated_at attribute with the current date and time
        self.updated_at = datetime.now()

    def save_to_dict(self):
        # returns a dictionary containing all keys/values in __dict__ of
        # the instance
        dictFormat = {}
        dictFormat["__class__"] = self.__class__.__name__
        for key, val in self.__dict__.items():
            if isinstance(val, datetime):
                dictFormat[key] = val.isoformat()
            else:
                dictFormat[key] = val
        return dictFormat

    def __str__(self):
        # string representation of a class instance and its attributes
        return f"{[self.__class__.__name__]} {(self.id)} {self.__dict__}"


# creating a class instance and updating it with a few attributes
my_model = MyBase()
my_model.name = "Julien Kimba"
my_model.julien_age = 37
print(my_model)
my_model.save_update()
print("\n=== dictionary representation below ===\n")
my_model_json = my_model.save_to_dict()
print(my_model_json)
print("\nJSON of my_model:\n")
for key, value in my_model_json.items():
    print("\t{}: ({}) - {}".format(key, type(value), value))
After the line where I defined the __str__() method, the rest is just instantiation and testing. I created an instance of our MyBase class, added a few attributes to it, and printed the instance. I then called the save_update() method to update the updated_at attribute with the current date and time stamp. Notice the separator line, === dictionary representation below ===, which shows where the output for our dictionary representation starts.
Next, I called the save_to_dict() method on our new instance my_model to create a dictionary representation of the instance. You can see I assigned the dictionary representation to a variable, my_model_json, and printed it out. The next line prints a separator text, JSON of my_model, to show where the next output starts.
Finally, I iterated over the key-value pairs of the dictionary representation and printed each key, its type, and its value in a formatted string. This gives a clear picture of the type and value of each attribute in the instance.
Below are the outputs. Study them to understand how they were instantiated and printed so that you can get the hang of each flow. Another thing to observe in the dictionary representation output is that we now have a __class__ key with the name of the class as its value. You should see something like '__class__': 'MyBase' in the dictionary. Recall that I talked about implementing this to identify the class that the instance attributes originated from during the serialization process.
['MyBase'] c6083c8c-fdca-498c-969e-749ef8fa262c {'id': 'c6083c8c-fdca-498c-969e-749ef8fa262c', 'created_at': datetime.datetime(2023, 5, 7, 14, 7, 26, 744427), 'updated_at': datetime.datetime(2023, 5, 7, 14, 7, 26, 744433), 'name': 'Julien Kimba', 'julien_age': 37}
=== dictionary representation below ===
{'__class__': 'MyBase', 'id': 'c6083c8c-fdca-498c-969e-749ef8fa262c', 'created_at': '2023-05-07T14:07:26.744427', 'updated_at': '2023-05-07T14:07:26.744433', 'name': 'Julien Kimba', 'julien_age': 37}
JSON of my_model:
__class__: (<class 'str'>) - MyBase
id: (<class 'str'>) - c6083c8c-fdca-498c-969e-749ef8fa262c
created_at: (<class 'str'>) - 2023-05-07T14:07:26.744427
updated_at: (<class 'str'>) - 2023-05-07T14:07:26.744433
name: (<class 'str'>) - Julien Kimba
julien_age: (<class 'int'>) - 37
We've created a method, save_to_dict(), to generate a dictionary representation of an instance. Now it's time to recreate an instance from this dictionary representation (deserialization).
Let's update our base class MyBase by adding the special parameters *args and **kwargs to the class constructor. These parameters will receive the arguments passed to the constructor of our base class, MyBase. We won't be using *args for this project, as I previously mentioned, but it is common practice to include it in the parameter list, although it won't be doing any work for us.
If the **kwargs dictionary is not empty, we are to create an attribute for each key of the dictionary, using the key as the attribute name and the corresponding value as the value of the attribute.
One important thing to note is that we won't be creating an attribute from the __class__ key and its value (the class name). Recall that we added a __class__ key to the dictionary representation to identify the class name of the serialized object when we were creating the save_to_dict() method. We don't need it on the instance itself, so we have to discard it.
Let's go ahead and implement these things I mentioned. I will only be writing the code for the things we are updating, and then later I will put them all together in one block.
# updating MyBase class constructor with special parameters
def __init__(self, *args, **kwargs):
    # discard the __class__ key if dictionary is not empty
    if kwargs:
        del kwargs["__class__"]
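One thing worth knowing about del: it assumes the key is there. The sketch below uses a made-up kwargs dictionary of mine that lacks the __class__ key to show what happens in that case, and how dict.pop() with a default value is a more forgiving alternative. This is only an optional variation, not a change to the project requirements.

```python
# hypothetical keyword arguments that lack a "__class__" key
kwargs = {"id": "1234", "name": "Julien Kimba"}

# del raises KeyError when the key is missing
try:
    del kwargs["__class__"]
except KeyError:
    print("del raised KeyError")

# pop() with a default removes the key if present and does nothing otherwise
removed = kwargs.pop("__class__", None)
print(removed)  # None
print(kwargs)   # {'id': '1234', 'name': 'Julien Kimba'}
```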
Recall that the values of created_at and updated_at are strings in this dictionary representation. We converted them to strings in our save_to_dict() method using isoformat(), but inside a MyBase instance, they are expected to be datetime objects. We have to convert these strings back into datetime objects.
We will be using the strptime() method to achieve this. It takes a date and time value of type str and returns it as a datetime object. It accepts two arguments: the string to be converted and a format string that describes how the date and time are laid out in that string. One more good thing about this method is that it can parse date and time strings in whatever format you choose, as long as the format string matches. (More details on how to use the strptime() method here).
As per the requirement for this project, we will be using this format: %Y-%m-%dT%H:%M:%S.%f (ex: 2017-06-14T22:31:03.285259).
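Here is a short sketch of the round trip from a datetime object to an ISO string and back, using the example value above. The constant name FORMAT is my own. The sketch also shows one caveat I think is worth keeping in mind: isoformat() leaves out the fractional part when the microsecond value is exactly 0, and in that case this format string no longer matches.

```python
from datetime import datetime

FORMAT = "%Y-%m-%dT%H:%M:%S.%f"

original = datetime(2017, 6, 14, 22, 31, 3, 285259)
as_text = original.isoformat()
print(as_text)  # 2017-06-14T22:31:03.285259

# the format string describes the layout of the text being parsed
restored = datetime.strptime(as_text, FORMAT)
print(restored == original)  # True

# caveat: isoformat() omits the fractional part when microsecond is 0
no_micro = datetime(2017, 6, 14, 22, 31, 3).isoformat()
print(no_micro)  # 2017-06-14T22:31:03
try:
    datetime.strptime(no_micro, FORMAT)
except ValueError:
    print("strptime rejected the string without microseconds")

# datetime.fromisoformat() (Python 3.7+) accepts both forms
print(datetime.fromisoformat(no_micro) == datetime(2017, 6, 14, 22, 31, 3))  # True
```

With datetime.now() the microsecond value is almost never 0, so this rarely shows up in practice, but it explains the occasional parsing error if you ever meet one.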
# updating MyBase class constructor with special parameters
def __init__(self, *args, **kwargs):
    # discard the __class__ key if dictionary is not empty
    if kwargs:
        del kwargs["__class__"]
        for key, val in kwargs.items():
            # grab date & time stamp keys and convert their values to datetime obj
            if key == "created_at" or key == "updated_at":
                dtime_obj = datetime.strptime(val, "%Y-%m-%dT%H:%M:%S.%f")
                setattr(self, key, dtime_obj)
            else:
                setattr(self, key, val)
We created instance attributes by setting each key to its value from the dictionary, but for the created_at and updated_at keys, we first converted their values to datetime objects and then set them as attributes on the instance.
The aim is to regenerate an instance from the dictionary representation passed in, and we've just done that.
The other case is when we don't have a dictionary to handle, that is, when the **kwargs parameter receives no arguments (kwargs is empty). Then the constructor should create a new instance of the class with the default instance attributes (id, created_at, and updated_at), like we did in the beginning.
This is a simple one. Our opening if statement (if kwargs:) checks whether kwargs is NOT empty. When that condition is false (kwargs is empty), the code block under the if statement is skipped, and program control moves to the else clause and executes the code in it.
Since we want our class constructor to create a new instance when it is not handling any dictionary (if kwargs is empty), our else clause will simply assign our default instance attributes: id, created_at, and updated_at. Remember that our ID is unique (two instances created this way will not share the same ID), so assigning a fresh ID is what marks the creation of a brand-new instance.
So, let's assign id, created_at, and updated_at (new instance).
# updating MyBase constructor with special parameters
def __init__(self, *args, **kwargs):
    # discard the __class__ key if dictionary is not empty
    if kwargs:
        del kwargs["__class__"]
        for key, val in kwargs.items():
            # grab date & time stamp keys and convert their values to datetime obj
            if key == "created_at" or key == "updated_at":
                dtime_obj = datetime.strptime(val, "%Y-%m-%dT%H:%M:%S.%f")
                setattr(self, key, dtime_obj)
            else:
                setattr(self, key, val)
    else:
        # create or assign ID, created_at and updated_at (new instance)
        self.id = str(uuid4())
        self.created_at = datetime.now()
        self.updated_at = datetime.now()
Let's put all these implementations in one block and test-run it with a few inputs.
from datetime import datetime
from uuid import uuid4


class MyBase:
    # updating MyBase class constructor with special parameters
    def __init__(self, *args, **kwargs):
        # discard the __class__ key if dictionary is not empty
        if kwargs:
            del kwargs["__class__"]
            for key, val in kwargs.items():
                # grab date & time stamp keys and convert their values to datetime obj
                if key == "created_at" or key == "updated_at":
                    dtime = datetime.strptime(val, "%Y-%m-%dT%H:%M:%S.%f")
                    setattr(self, key, dtime)
                else:
                    setattr(self, key, val)
        else:
            # create/assign ID, created_at & updated_at (new instance)
            self.id = str(uuid4())
            self.created_at = datetime.now()
            self.updated_at = datetime.now()

    def save_update(self):
        # update the updated_at attribute with the current date & time
        self.updated_at = datetime.now()

    def save_to_dict(self):
        # returns a dictionary containing all keys/values of __dict__ of
        # the instance
        dictFormat = {}
        dictFormat["__class__"] = self.__class__.__name__
        for key, val in self.__dict__.items():
            if isinstance(val, datetime):
                dictFormat[key] = val.isoformat()
            else:
                dictFormat[key] = val
        return dictFormat

    def __str__(self):
        # string representation of a class instance and its attributes
        return f"{[self.__class__.__name__]} {(self.id)} {self.__dict__}"
See the test inputs below.
# creating a class instance and updating it with a few attributes
my_model = MyBase()
my_model.name = "Julien Kimba"
my_model.julien_age = 37
print(my_model.id)
print(my_model)
print(my_model.created_at)
print("\n------- Serialized to dictionary below -------\n")
my_model_json = my_model.save_to_dict()
print(my_model_json)
print("\n------ JSON of <'my_model'> below ------\n")
for key, value in my_model_json.items():
    print("\t{}: ({}) - {}".format(key, type(value), value))
print("\n ------ deserialized to an instance below ------\n")
# pass the dictionary to the class constructor for deserialization
new_model = MyBase(**my_model_json)
print(new_model.id)
print(new_model)
print(new_model.created_at)
print("\n------ the two instances compared below -----\n")
print(new_model is my_model)
I created an instance of our class MyBase and set some attributes on it, in addition to the id, created_at, and updated_at attributes that were assigned by default at instantiation. Then I serialized it to a dictionary using the save_to_dict() method of the class and printed the serialized dictionary, followed by each key with its data type and value.
Next is the opposite of what was previously done (deserialization). I created a new instance of the MyBase class by passing the dictionary we got from the previous step to the constructor using the dictionary unpacking operator **. Finally, I printed out the attributes of the new instance and compared it with the original instance using the is operator, which evaluates to False. It evaluates to False because even though the attributes of the two instances new_model and my_model are identical, they are two different instances of the same class.
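The difference between "same object" and "same attributes" can be checked directly. The sketch below is a condensed version of the class, written under two assumptions of mine: it skips the __class__ key inside the loop instead of deleting it, and it passes timespec="microseconds" to isoformat() so that the string always contains the fractional part and parsing never fails. It compares the two instances with is and then compares their __dict__ contents.

```python
from datetime import datetime
from uuid import uuid4

FORMAT = "%Y-%m-%dT%H:%M:%S.%f"


class MyBase:
    def __init__(self, *args, **kwargs):
        if kwargs:
            for key, val in kwargs.items():
                if key == "__class__":
                    continue  # the class name is not stored as an attribute
                if key in ("created_at", "updated_at"):
                    val = datetime.strptime(val, FORMAT)
                setattr(self, key, val)
        else:
            self.id = str(uuid4())
            self.created_at = datetime.now()
            self.updated_at = datetime.now()

    def save_to_dict(self):
        result = {"__class__": self.__class__.__name__}
        for key, val in self.__dict__.items():
            if isinstance(val, datetime):
                # timespec keeps the microseconds even when they are 0
                val = val.isoformat(timespec="microseconds")
            result[key] = val
        return result


my_model = MyBase()
my_model.name = "Julien Kimba"
new_model = MyBase(**my_model.save_to_dict())

# identity: are they the very same object in memory?
print(new_model is my_model)                    # False

# equality of state: do they hold the same attributes and values?
print(new_model.__dict__ == my_model.__dict__)  # True
```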
Here is the output.
55c7bea4-2013-4dbd-b0f2-980b8edeec63
['MyBase'] 55c7bea4-2013-4dbd-b0f2-980b8edeec63 {'id': '55c7bea4-2013-4dbd-b0f2-980b8edeec63', 'created_at': datetime.datetime(2023, 5, 8, 14, 10, 2, 357567), 'updated_at': datetime.datetime(2023, 5, 8, 14, 10, 2, 357572), 'name': 'Julien Kimba', 'julien_age': 37}
2023-05-08 14:10:02.357567
------- Serialized to dictionary below -------
{'__class__': 'MyBase', 'id': '55c7bea4-2013-4dbd-b0f2-980b8edeec63', 'created_at': '2023-05-08T14:10:02.357567', 'updated_at': '2023-05-08T14:10:02.357572', 'name': 'Julien Kimba', 'julien_age': 37}
------ JSON of <'my_model'> below ------
__class__: (<class 'str'>) - MyBase
id: (<class 'str'>) - 55c7bea4-2013-4dbd-b0f2-980b8edeec63
created_at: (<class 'str'>) - 2023-05-08T14:10:02.357567
updated_at: (<class 'str'>) - 2023-05-08T14:10:02.357572
name: (<class 'str'>) - Julien Kimba
julien_age: (<class 'int'>) - 37
------ deserialized to an instance below -----
55c7bea4-2013-4dbd-b0f2-980b8edeec63
['MyBase'] 55c7bea4-2013-4dbd-b0f2-980b8edeec63 {'id': '55c7bea4-2013-4dbd-b0f2-980b8edeec63', 'created_at': datetime.datetime(2023, 5, 8, 14, 10, 2, 357567), 'updated_at': datetime.datetime(2023, 5, 8, 14, 10, 2, 357572), 'name': 'Julien Kimba', 'julien_age': 37}
2023-05-08 14:10:02.357567
------ the two instances compared below -----
False
Examine the outputs accordingly to understand how they were instantiated and printed so that you can get the hang of each flow.
The IDs and timestamps are all the same because we used the dictionary generated from serializing our first instance to create the new instance. The values should be the same, since we have only implemented and demonstrated the first piece of the serialization/deserialization process.
This is where I serialized our first instance: my_model_json = my_model.save_to_dict(), and this is where I used the serialized (dictionary) representation to create a new instance: new_model = MyBase(**my_model_json).
Conclusion.
Congratulations to us! We've successfully written the code to implement the first piece of the serialization/deserialization process, where we created a method to generate a dictionary representation of an instance and also recreated an instance from this dictionary representation. However, it is not persistent yet.
In the next episode, we will be converting the dictionary representation to a standard representation of a data structure (JSON), which is readable and writeable across other programming languages and platforms.
Check out part 3 via the below link:
Thanks for reading!
This is where I will call it a wrap for today's episode because I wouldn't want this article to become bulky. My sole aim is for anyone seeking guidelines from this article to completely understand the underlying structure and flow of each implementation.
I will duly appreciate your feedback in the comment box, and you can follow my blog to stay tuned each time I upload the next part of this series. Also, feel free to reach out to me (Email, Twitter, LinkedIn) if you are stuck and need my assistance on this project.