
Dependency Injection explained


Dependency Injection, often abbreviated as DI, is a design pattern (or programming technique) and a term that gets thrown around a lot in software development. When I first encountered it, I didn’t understand it because it sounded complicated. To my surprise, it is just a fancy name for a simple concept. Before going any further, let’s lay some groundwork and define a few terms.

Service
any class that contains some useful functionality.

Dependency
a service (any class) that is used by another class or function.
Let’s say we have a web server with two classes: an authentication class and a database management class. Users can send requests to our web server to add or delete data. On the server side, the database management class is responsible for connecting to the database and modifying the data, but it relies on the authentication class to check whether the user making the request is authenticated and has the necessary privileges. For the database management class, the authentication class is a dependency, because the database management class depends on it to do its job.

Client
a class that uses a service (another class) as its dependency.
In the web server example above, the database management class is the client. A client can itself be a dependency of another class.

Code example:

class AuthenticateUser:

    # some other code

    def isAuthenticated(self, request):
        return request.user != "AnonymousUser"


class SomeDatabaseConnector:
     # some other code

    def add_data(self, user, data_to_enter):
        # logic to connect to db and record the data
        return "data entered successfully"

class Database_Management:
    def __init__(self):
        # the dependencies are created inside the client
        self.authenticator = AuthenticateUser()
        self.database = SomeDatabaseConnector()

    # some other code

    def add_data(self, request, data):
        if self.authenticator.isAuthenticated(request):
            return self.database.add_data(user=request.user, data_to_enter=data)
        else:
            return "User is not authenticated"

db = Database_Management()
# we don't dive deep into how we got request and data.
# they are usually provided by the web framework/library.
db.add_data(request, data)

What is Dependency Injection

Now we have some idea of what dependencies and clients are, so it is time to define “Dependency Injection”. Dependency injection means passing an already created object as an argument to a client function/class instead of creating that object in the body of the client. If my definition didn’t make much sense, here is an alternative definition from Wikipedia: “Dependency injection is a programming technique in which an object or function receives other objects or functions that it requires, as opposed to creating them internally”.

One example is worth dozens of definitions, isn’t it? I hear you. So below is the example we used earlier, this time with the dependencies being injected.

class AuthenticateUser:

    # some other code

    def isAuthenticated(self, request):
        return request.user != "AnonymousUser"


class SomeDatabaseConnector:
     # some other code

    def add_data(self, user, data_to_enter):
        # logic to connect to db and record the data
        return "data entered successfully"

class Database_Management:
    def __init__(self, authentication_class, database_connector_class):
        # the dependencies are received as arguments, not created here
        self.authenticator = authentication_class
        self.database = database_connector_class

    # some other code

    def add_data(self, request, data):
        if self.authenticator.isAuthenticated(request):
            return self.database.add_data(user=request.user, data_to_enter=data)
        else:
            return "User is not authenticated"


auth_class = AuthenticateUser()
db_connector_class = SomeDatabaseConnector()

# Pay attention here
# we are INJECTING (passing) dependencies to the client Database_Management class
db = Database_Management(auth_class, db_connector_class)
# we don't dive deep into how we got request and data.
# they are usually provided by the web framework/library.
db.add_data(request, data)

You might ask what the purpose of injecting dependencies in this way is. After all, the above example doesn’t look that impressive, right? What is the difference between initializing the dependencies inside the client and passing them to the client as arguments?

The above example illustrates the mechanics of dependency injection but not its benefits, so let’s talk about those. The primary benefit is keeping the parts of a program loosely coupled. As this StackOverflow answer excellently points out, “the objects change more frequently then the code that uses them”. In tightly coupled code, a change in one part of the program requires modifications in multiple places. In loosely coupled code, a change in one part ideally requires no modification elsewhere. By injecting dependencies, i.e., passing already initialized objects as arguments rather than creating them internally, we keep the creation and the usage of an object separate. The client function/class doesn’t need to know how to create the object or even which concrete object it is using; it only needs to know how to use it. As long as the methods and fields the client relies on stay the same, your program continues to work even if you swap the dependencies or change how they are constructed. Dependency injection also allows sharing state among client classes.

As an example, let’s imagine an app that allows users to set profile photos. Our app uses an AWS S3 bucket (storage) to store user photos.

class S3:
    def __init__(self):
        # AWS S3 specific code such as
        # using boto3, connect to s3 bucket
        pass

    def upload(self, data):
        # logic to upload the data
        return "link to the uploaded data"

class UserProfile:

    def __init__(self, cloud_storage):
        self.storage = cloud_storage

    # other code
    def profile_photo(self, user, photo):
        link = self.storage.upload(photo)
        # save the link to the database that points to this user
        return "successfully uploaded profile photo"

s3_bucket = S3()
user = UserProfile(s3_bucket)

user.profile_photo(request.user, photo)

After some time, we found out that Google Cloud offered a cheaper storage solution, so we decided to use Google Cloud Storage instead of AWS S3 to store new user profile photos.

class Google_Storage:
    def __init__(self):
        # Google Storage specific code such as
        # different API and interface to connect to GCP storage
        pass

    def upload(self, data):
        # logic to upload the data
        return "link to the uploaded data"

class UserProfile:

    def __init__(self, cloud_storage):
        self.storage = cloud_storage

    # other code
    def profile_photo(self, user, photo):
        link = self.storage.upload(photo)
        # save the link to the database that points to this user
        return "successfully uploaded profile photo"

gcp_bucket = Google_Storage()
user = UserProfile(gcp_bucket)

user.profile_photo(request.user, photo)

As long as the storage service has an upload method that takes the data as its argument and returns a link to the uploaded data, the UserProfile class neither cares nor needs to know which class it is using (S3 or Google_Storage). This comes in handy in testing too: we can easily swap the dependency services with mocks to test the client.

Note: In real life, the entire logic of an application is usually not defined in a single file. Besides, real code is much longer than the above examples.

Let’s see another example, in which the client class is not concerned with how the dependency is initialized.
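To make the testing point concrete, here is a minimal sketch of a test that swaps the real storage for a stand-in. FakeStorage, its example link, and the sample user and photo values are hypothetical names invented for this illustration; UserProfile is repeated from the example above so that the snippet runs on its own.

```python
class FakeStorage:
    """Hypothetical test double: records uploads instead of calling a cloud API."""

    def __init__(self):
        self.uploaded = []

    def upload(self, data):
        # remember what was uploaded so the test can inspect it
        self.uploaded.append(data)
        return "https://example.test/fake-photo"


class UserProfile:
    def __init__(self, cloud_storage):
        self.storage = cloud_storage

    def profile_photo(self, user, photo):
        link = self.storage.upload(photo)
        # a real implementation would save the link for this user
        return "successfully uploaded profile photo"


# inject the fake instead of S3 or Google_Storage
fake = FakeStorage()
profile = UserProfile(fake)
result = profile.profile_photo("alice", b"photo-bytes")

assert result == "successfully uploaded profile photo"
assert fake.uploaded == [b"photo-bytes"]
```

No network connection or cloud credentials are needed, because UserProfile never learns that it is talking to a fake. The standard library’s unittest.mock.Mock could probably play the same role as the hand-written FakeStorage.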

class StorageClass:
    # hardcoded fields
    _cloud_provider = "AWS"
    _storage = "S3"
    _bucket_name = "my_bucket"

    # some other logic

    def upload(self, data):
        # connect to database
        # save the data
        return "link to the uploaded data"

    def delete(self, data_id):
        # connect to database
        # delete the data
        return "successfully deleted"


class FreemiumUser:

    def __init__(self, storage_class):
        self.storage = storage_class

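    def replace_profile_photo(self, request, photo):
        # delete existing photo
        # compress the new photo and
        # upload it using the storage dependency class
        old_photo_id = request.user.photo_id  # hypothetical attribute, for illustration
        new_compressed_photo = photo  # compression logic omitted
        self.storage.delete(old_photo_id)
        self.storage.upload(new_compressed_photo)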


class PremiumUsers:

    def __init__(self, storage_class):
        self.storage = storage_class

    # some other functionalities

    def add_profile_photo(self, request, photo):
        # instead of deleting existing photo
        # allow user to have more than one profile photo
        self.storage.upload(photo)

dependency = StorageClass()
freemium_users = FreemiumUser(dependency)
freemium_users.replace_profile_photo(request, photo)

premium_user = PremiumUsers(dependency)
premium_user.add_profile_photo(request, photo)

In the above code, we have a StorageClass dependency that is used by the FreemiumUser and PremiumUsers clients. Imagine it is a big application and several developers are responsible for different parts of it. The web development team is responsible for, among other things, the FreemiumUser and PremiumUsers classes. You are responsible for StorageClass.

In a hurry, you hardcoded the StorageClass fields. You know it would be much better to turn them into parameters. Since your team is using dependency injection and the other classes rely only on StorageClass’s methods, not on its fields, you can change StorageClass without impacting the classes that depend on it, namely FreemiumUser and PremiumUsers.

class StorageClass:

    def __init__(self, cloud_provide, storage_name, bucket_name):
        self.cloud_provider = cloud_provide
        self.storage = storage_name
        self.bucket_name = bucket_name

    # some other logic

    def upload(self, data):
        # connect to database
        # save the data
        return "link to the uploaded data"

    def delete(self, data_id):
        # connect to database
        # delete the data
        return "successfully deleted"


class FreemiumUser:

    def __init__(self, storage_class):
        self.storage = storage_class

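    def replace_profile_photo(self, request, photo):
        # delete existing photo
        # compress the new photo and
        # upload it using the storage dependency class
        old_photo_id = request.user.photo_id  # hypothetical attribute, for illustration
        new_compressed_photo = photo  # compression logic omitted
        self.storage.delete(old_photo_id)
        self.storage.upload(new_compressed_photo)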


class PremiumUsers:

    def __init__(self, storage_class):
        self.storage = storage_class

    # some other functionalities

    def add_profile_photo(self, request, photo):
        # instead of deleting existing photo
        # allow user to have more than one profile photo
        self.storage.upload(photo)

dependency = StorageClass(
    cloud_provide="AWS",
    storage_name="S3",
    bucket_name="my_bucket"
)
freemium_users = FreemiumUser(dependency)
freemium_users.replace_profile_photo(request, photo)

premium_user = PremiumUsers(dependency)
premium_user.add_profile_photo(request, photo)

If you were NOT using dependency injection, i.e., if StorageClass were initialized inside the client classes, the change would break FreemiumUser and PremiumUsers, and you would have to ask the maintainers of those classes to update them.

Another benefit of dependency injection is sharing state among clients. It is similar to the singleton concept.
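The breakage described above can be reproduced in a few lines. This is only a sketch: TightlyCoupledUser and InjectedUser are hypothetical clients invented for the illustration, and the upload method returns a made-up path instead of talking to a real storage service.

```python
class StorageClass:
    # the constructor now requires parameters, as in the updated StorageClass
    def __init__(self, cloud_provide, storage_name, bucket_name):
        self.cloud_provider = cloud_provide
        self.storage = storage_name
        self.bucket_name = bucket_name

    def upload(self, data):
        # stand-in for a real upload: just build a made-up path
        return f"{self.bucket_name}/{data}"


class TightlyCoupledUser:
    """Hypothetical client that creates its own dependency."""

    def __init__(self):
        # written against the old constructor that took no arguments
        self.storage = StorageClass()


class InjectedUser:
    """Hypothetical client that receives its dependency."""

    def __init__(self, storage_class):
        self.storage = storage_class


try:
    TightlyCoupledUser()
    broke = False
except TypeError as error:
    # the constructor change surfaces inside the client
    broke = True
    print(f"tightly coupled client broke: {error}")

# only the code that creates the dependency had to change
storage = StorageClass("AWS", "S3", "my_bucket")
injected = InjectedUser(storage)
print(injected.storage.upload("photo.png"))
```

The tightly coupled client fails as soon as it is constructed, while the injected client keeps working because the only code that knows about the constructor is the place where the dependency is created.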

class DependencyClass {
    constructor() {
        this.isDataReady = false;
        this.data = null;
    }

    processData(data) {
        // Process data logic (omitted here); store the result
        this.data = data;
        return "done";
    }

    deleteAll() {
        // Delete all data
        this.data = null;
    }
}

class Client1 {
    constructor(dependencyClass) {
        this.dependency = dependencyClass;
    }

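    async fetchData() {
        // Asynchronously fetch data.
        // this.FetchData() is a placeholder for the real fetching logic,
        // which is not defined in this example.
        const fetchedData = await this.FetchData();
        this.dependency.processData(fetchedData);
        this.dependency.isDataReady = true;
    }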

}

class Client2 {
    constructor(dependencyClass) {
        this.dependency = dependencyClass;
    }

    consumeData() {
        // Check dependency.isDataReady every 10 seconds
        this.intervalId = setInterval(() => {
            if (this.dependency.isDataReady) {

                // use this.dependency.data

                // Stop checking once data is ready and consumed
                clearInterval(this.intervalId);
            }
        }, 10000); // 10 seconds
    }

    deleteData() {
        this.dependency.deleteAll();
        this.dependency.isDataReady = false;
    }
}

const dependency = new DependencyClass();
const client1 = new Client1(dependency);
const client2 = new Client2(dependency);

client1.fetchData();
client2.consumeData();

In the above code, the Client1 and Client2 classes use DependencyClass’s isDataReady and data fields to share state, i.e., to let interested parties know whether the data is ready. Client1 and Client2 do not need to know about each other’s existence; they only communicate with DependencyClass. Client1 is not concerned with whether the data it fetches and hands to DependencyClass is used by one class or ten different classes. Likewise, Client2 is not concerned with which class fetches the data or how.

Note: you can inject dependencies not only through class constructors but also via setter methods and interfaces.
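For readers following along in Python, the same idea can be sketched without timers or asynchronous code. SharedData, Producer, and Consumer are hypothetical names for this illustration, and the list assigned in fetch_data stands in for a real fetch.

```python
class SharedData:
    """Hypothetical dependency that holds the shared state."""

    def __init__(self):
        self.is_data_ready = False
        self.data = None


class Producer:
    def __init__(self, shared):
        self.shared = shared

    def fetch_data(self):
        # stand-in for a real fetch
        self.shared.data = [1, 2, 3]
        self.shared.is_data_ready = True


class Consumer:
    def __init__(self, shared):
        self.shared = shared

    def consume(self):
        # only touch the data once the producer has marked it ready
        if not self.shared.is_data_ready:
            return None
        return sum(self.shared.data)


# both clients receive the SAME object, so they see the same state
shared = SharedData()
producer = Producer(shared)
consumer = Consumer(shared)

before = consumer.consume()  # nothing fetched yet
producer.fetch_data()
after = consumer.consume()

assert before is None
assert after == 6
```

The state is shared only because the same instance is injected into both clients. If each client created its own SharedData, the consumer would never see the producer’s data.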

Example of setter injection:

class Service:
    def do_something(self):
        print("Doing something in the service...")

class Client:
    def __init__(self):
        self._service = None

    def set_service(self, service: Service):
        """Injects the service dependency."""
        self._service = service

    def do_something_in_client(self):
        if self._service is not None:
            self._service.do_something()
        else:
            print("Service not injected!")

client = Client()
service = Service()
client.set_service(service)
client.do_something_in_client()

The point is that you don’t necessarily need to inject a dependency when the client class is initialized. With setter methods, you can inject dependencies after the client has been created.
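Interface injection, mentioned in the note above, is not shown in the original examples. One common reading of the term, sketched here as an assumption, is that the client implements an interface declaring an injection method, and an injector works against that interface. ServiceInjectable, ReportClient, and inject_all are hypothetical names invented for this sketch.

```python
from abc import ABC, abstractmethod


class Service:
    def do_something(self) -> str:
        return "Doing something in the service..."


class ServiceInjectable(ABC):
    """Hypothetical interface: implementers declare that they accept a Service."""

    @abstractmethod
    def inject_service(self, service: Service) -> None:
        ...


class ReportClient(ServiceInjectable):
    def __init__(self) -> None:
        self._service = None

    def inject_service(self, service: Service) -> None:
        self._service = service

    def run(self) -> str:
        if self._service is None:
            raise RuntimeError("Service not injected!")
        return self._service.do_something()


def inject_all(clients, service):
    # the injector relies only on the interface, not on concrete client classes
    for client in clients:
        client.inject_service(service)


clients = [ReportClient(), ReportClient()]
inject_all(clients, Service())
outputs = [client.run() for client in clients]

assert outputs == ["Doing something in the service..."] * 2
```

Compared with a plain setter, the interface makes the contract explicit: any class that subclasses ServiceInjectable must provide inject_service before it can be instantiated.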


Bonus section

Up until now, we have mostly used Python in our examples. Dependency injection is a concept/technique that can be implemented in almost all languages. Some languages have nice shortcuts that make dependency injection even more concise. One such language is TypeScript.

Explicit way:

class DependencyClass {
    // class logic
}


class Client {

    private dep: DependencyClass;

    constructor(Dependency: DependencyClass) {
        this.dep = Dependency;
    }
}

let dependency = new DependencyClass()
let client = new Client(dependency)

In this version, the constructor receives an instance of DependencyClass as an argument (Dependency).
The argument (Dependency) is then assigned to the dep property of the Client class.
This approach explicitly defines the dep property in Client class and assigns the parameter to it in the constructor.

Shorthand:

class DependencyClass {
    // class logic
}


class Client {

    constructor(private dep: DependencyClass)
    { }
}

let dependency = new DependencyClass()
let client = new Client(dependency)

This version uses TypeScript’s shorthand syntax for property declarations.
By adding the private keyword in the constructor parameter (private dep: DependencyClass), TypeScript automatically creates a private class property named dep and assigns the constructor argument to it.
There’s no need to explicitly declare and assign dep in the class body; TypeScript handles both in one step.
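Python has no direct counterpart to this shorthand, but the standard library’s dataclasses module gives a roughly similar effect: declaring a field once generates a constructor that accepts and stores it. The sketch below is an analogy rather than an equivalent, since the generated attribute is public, unlike the private TypeScript property. The greet and run methods are hypothetical additions so that there is something to call.

```python
from dataclasses import dataclass


class DependencyClass:
    def greet(self) -> str:
        return "hello from the dependency"


@dataclass
class Client:
    # declared once: becomes both a constructor parameter and an attribute
    dep: DependencyClass

    def run(self) -> str:
        return self.dep.greet()


dependency = DependencyClass()
client = Client(dependency)

assert client.dep is dependency
assert client.run() == "hello from the dependency"
```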


Conclusion

Dependency Injection is a simple concept: passing an already created object to another class or function as an argument. Its main benefit is separation of concerns: initializing a dependency is kept apart from using it. Because it enables loose coupling, it is especially useful when you need to test your code or modify parts of it.
