Marshalling Python Dataclasses
Recently I wanted a way to transfer structured messages between two Python applications over a Unix domain socket. The cleanest and simplest way I have found so far is to use the dataclasses and json modules from the standard library.
We’ll consider the following message for the rest of the post:
from dataclasses import dataclass

@dataclass
class QueryUserMessage:
    auth_key: str
    username: str
Marshalling
Let’s say we have a message we want to send:
message = QueryUserMessage("lkajdfsas", "brozek")
We first need to get its dictionary representation. Luckily, the standard library has us covered:
from dataclasses import asdict
message_dict = asdict(message)
Then we can use the json module to get a string representation:
import json
message_str = json.dumps(message_dict)
Finally, we can encode it into bytes and send it away:
# Default encoding is "utf-8"
message_bytes = message_str.encode()
# Assuming connection is defined...
connection.sendall(message_bytes)
To make this easier for myself, I created a custom json encoder and a function that uses the connection to send off the message:
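from dataclasses import asdict, is_dataclass

class DataclassEncoder(json.JSONEncoder):
    def default(self, o):
        # Convert dataclass instances; defer everything else to the base class
        if is_dataclass(o) and not isinstance(o, type):
            return asdict(o)
        return super().default(o)

def send_message(connection, message_dataclass):
    contents = json.dumps(message_dataclass, cls=DataclassEncoder).encode()
    connection.sendall(contents)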
Un-marshalling
On the other end, we receive the bytes and decode them into a string:
MESSAGE_BUFFER_LEN = 1024
message_bytes = connection.recv(MESSAGE_BUFFER_LEN)
message_str = message_bytes.decode()
We can use the json module to turn it into a Python dictionary:
message_dict = json.loads(message_str)
In this post we only have one message class, so we can pass the fields of the dictionary straight to its constructor. With more than one message type, you would either rely on some protocol or agree on the message type ahead of time.
message = QueryUserMessage(**message_dict)
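The receiving steps can be wrapped up in the same way as the sending side. The following is a minimal sketch, not a definitive implementation: receive_message is a hypothetical helper name, and it assumes the whole message arrives in a single recv() call. socket.socketpair() is used only to get two connected sockets for a local round trip.

```python
import json
import socket
from dataclasses import asdict, dataclass


@dataclass
class QueryUserMessage:
    auth_key: str
    username: str


MESSAGE_BUFFER_LEN = 1024


def receive_message(connection, message_class):
    """Read one message and build an instance of message_class (hypothetical helper)."""
    # Assumption: the whole message fits in the buffer and arrives in one recv() call.
    message_bytes = connection.recv(MESSAGE_BUFFER_LEN)
    message_dict = json.loads(message_bytes.decode())
    return message_class(**message_dict)


# socketpair() returns two already-connected sockets, handy for a local demo.
sender, receiver = socket.socketpair()
try:
    original = QueryUserMessage("lkajdfsas", "brozek")
    sender.sendall(json.dumps(asdict(original)).encode())
    received = receive_message(receiver, QueryUserMessage)
finally:
    sender.close()
    receiver.close()
```

Since dataclasses generate an equality method by default, received == original is a quick way to check that nothing was lost along the way.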
Conclusion
In production use cases, we’ll need to introduce a gamut of error handling to capture failures in JSON deserialization and class instantiation. I hope, however, that this serves as a good starting point.
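As a rough idea of what that error handling might look like, here is a sketch that funnels the likely failures into a single exception. MessageDecodeError and parse_query_user_message are hypothetical names made up for this example. Note that dataclasses do not check field types at runtime, so this only catches malformed JSON and missing or unexpected fields.

```python
import json
from dataclasses import dataclass


@dataclass
class QueryUserMessage:
    auth_key: str
    username: str


class MessageDecodeError(Exception):
    """Hypothetical error type for any failure while un-marshalling."""


def parse_query_user_message(message_bytes):
    try:
        message_dict = json.loads(message_bytes.decode())
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise MessageDecodeError("payload is not valid UTF-8 JSON") from exc
    if not isinstance(message_dict, dict):
        raise MessageDecodeError("payload is not a JSON object")
    try:
        return QueryUserMessage(**message_dict)
    except TypeError as exc:
        # Raised by the constructor for missing or unexpected fields.
        raise MessageDecodeError("fields do not match QueryUserMessage") from exc
```

The caller then only has to handle one exception type and can decide whether to drop the message, log it, or close the connection.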
Some things to consider:
- If you have multiple types of messages, consider including a string field in the dictionary that names the message type. Both applications can then maintain a map between these strings and the class constructors.
- If it’s possible to have messages larger than the buffer length, then consider either setting it higher or sending the size of the message beforehand.
- Using a standard HTTP library ;)
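The first two suggestions can be combined. The sketch below is one possible wire format, not a standard: each message is a JSON object with "type" and "fields" keys, preceded by a 4-byte big-endian length. PingMessage, MESSAGE_TYPES, HEADER and recv_exactly are hypothetical names introduced for illustration, and send_message here is a variant that writes this framed format.

```python
import json
import socket
import struct
from dataclasses import asdict, dataclass


@dataclass
class QueryUserMessage:
    auth_key: str
    username: str


@dataclass
class PingMessage:  # hypothetical second message type, for illustration only
    nonce: int


# Both applications keep the same map from type name to class.
MESSAGE_TYPES = {cls.__name__: cls for cls in (QueryUserMessage, PingMessage)}
HEADER = struct.Struct("!I")  # 4-byte unsigned big-endian payload length


def send_message(connection, message):
    body = {"type": type(message).__name__, "fields": asdict(message)}
    payload = json.dumps(body).encode()
    connection.sendall(HEADER.pack(len(payload)) + payload)


def recv_exactly(connection, size):
    """Keep calling recv() until exactly `size` bytes have been read."""
    chunks = []
    while size > 0:
        chunk = connection.recv(size)
        if not chunk:
            raise ConnectionError("socket closed before the full message arrived")
        chunks.append(chunk)
        size -= len(chunk)
    return b"".join(chunks)


def receive_message(connection):
    (length,) = HEADER.unpack(recv_exactly(connection, HEADER.size))
    body = json.loads(recv_exactly(connection, length).decode())
    message_class = MESSAGE_TYPES[body["type"]]  # KeyError for unknown types
    return message_class(**body["fields"])


sender, receiver = socket.socketpair()
try:
    send_message(sender, PingMessage(nonce=7))
    # This payload is longer than a 1024-byte buffer.
    send_message(sender, QueryUserMessage("lkajdfsas", "brozek" * 500))
    first = receive_message(receiver)
    second = receive_message(receiver)
finally:
    sender.close()
    receiver.close()
```

Because the receiver knows the length up front, two messages sent back to back are read as two separate messages, and a message longer than any fixed buffer is reassembled from however many recv() calls it takes.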