
Converting an existing project to use dg

info

dg and Dagster Components are under active development. You may encounter feature gaps, and the APIs may change. To report issues or give feedback, please join the #dg-components channel in the Dagster Community Slack.

Suppose we have an existing Dagster project. Our project defines a Python package with a single Dagster asset. The asset is exposed in a top-level Definitions object in my_existing_project/definitions.py. We'll consider two cases: a project that uses uv with pyproject.toml, and one that uses pip with setup.py.

tree
.
├── my_existing_project
│   ├── __init__.py
│   ├── assets.py
│   ├── definitions.py
│   └── py.typed
├── pyproject.toml
└── uv.lock

2 directories, 6 files

dg needs to be able to resolve a Python environment for your project. This environment must include an installation of your project package. By default, a project's environment will resolve to whatever virtual environment is currently activated in the shell, or system Python if no virtual environment is activated.
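To see which environment this default would pick up, and whether the project package is installed in it, a quick check from Python can help. This is a minimal sketch using only the standard library. The helper name describe_environment is hypothetical and not part of dg, and it only approximates the resolution rule described above.

```python
import importlib.util
import sys


def describe_environment(package_name: str) -> dict:
    """Report the active interpreter and whether a package is importable from it."""
    return {
        # Inside a virtual environment, sys.prefix differs from sys.base_prefix.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "prefix": sys.prefix,
        # find_spec returns None when a top-level package is not installed here.
        "package_installed": importlib.util.find_spec(package_name) is not None,
    }


if __name__ == "__main__":
    print(describe_environment("my_existing_project"))
```

If package_installed comes back False, the project package has probably not been installed into the active environment yet.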

Before proceeding, we'll make sure we have an activated and up-to-date virtual environment in the project root. Having the virtual environment located in the project root is recommended (particularly when using uv) but not required.

If you don't have a virtual environment yet, run:

uv sync

Then activate it:

source .venv/bin/activate

Install dependencies

Install the dg command line tool

We'll install dg globally as a uv tool:

uv tool install dagster-dg

This installs dg into a hidden, isolated Python environment separate from your project virtual environment. The dg executable is always available in the user's $PATH, regardless of any virtual environment activation in the shell. This is the recommended way to work with dg if you are using uv.

Update project structure

Add dg configuration

The dg command recognizes Dagster projects through the presence of TOML configuration. This may be either a pyproject.toml file with a tool.dg section or a dg.toml file. Let's add this configuration:

Since our project already has a pyproject.toml file, we can just add the requisite tool.dg section to the file:

pyproject.toml
...
[tool.dg]
directory_type = "project"

[tool.dg.project]
root_module = "my_existing_project"
code_location_target_module = "my_existing_project.definitions"

There are three settings:

  • directory_type = "project": This is how dg identifies your package as a Dagster project. This is required.
  • project.root_module = "my_existing_project": This points to the root module of your project. This is also required.
  • project.code_location_target_module = "my_existing_project.definitions": This tells dg where to find the top-level Definitions object in your project. It defaults to [root_module].definitions, so setting it here is not strictly necessary. We include it to be explicit, because existing projects might define the top-level Definitions object in a different module, in which case this setting is required.

Now that these settings are in place, you can interact with your project using dg. If we run dg list defs we can see the sole existing asset in our project:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ my_asset │ default │ │ │ │ │
│ │ └──────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────┘

Add a dagster_dg.plugin entry point

We're not quite done adding configuration. dg uses the Python entry point API to expose custom component types and other scaffoldable objects from user projects. Our entry point declaration will specify a submodule as the location where our project exposes plugin objects. By convention, this submodule is named <root_module>.lib. In our case, it will be my_existing_project.lib. Let's create this submodule now:

mkdir my_existing_project/lib && touch my_existing_project/lib/__init__.py

tip

See the plugin guide for more on dg plugins.

We'll need to add a dagster_dg.plugin entry point to our project and then reinstall the project package into our virtual environment. The reinstallation step is crucial. Python entry points are registered at package installation time, so if you simply add a new entry point to an existing editable-installed package, it won't be picked up.

Entry points can be declared in either pyproject.toml or setup.py:

Since our package metadata is in pyproject.toml, we'll add the entry point declaration there:

pyproject.toml
...
[project.entry-points]
"dagster_dg.plugin" = { my_existing_project = "my_existing_project.lib"}
...

Then we'll reinstall the package. Note that uv sync will not reinstall our package, so we'll use uv pip install instead:

uv pip install --editable .

To make sure our plugin is working, let's scaffold a new component type and then make sure it's available to dg commands. First create the component type:

dg scaffold component-type Foo

Creating a Dagster component type at /.../my-existing-project/my_existing_project/lib/foo.py.
Scaffolded files for Dagster component type at /.../my-existing-project/my_existing_project/lib/foo.py.

Then run dg list plugins to confirm that the new component type is available:

dg list plugins
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Plugin ┃ Objects ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ dagster │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩ │
│ │ │ dagster.asset │ Create a │ [scaffold-ta… │ │
│ │ │ │ definition │ │ │
│ │ │ │ for how to │ │ │
│ │ │ │ compute an │ │ │
│ │ │ │ asset. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.asset_check │ Create a │ [scaffold-ta… │ │
│ │ │ │ definition │ │ │
│ │ │ │ for how to │ │ │
│ │ │ │ execute an │ │ │
│ │ │ │ asset check. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.components.DefinitionsComponent │ An arbitrary │ [component, │ │
│ │ │ │ set of │ scaffold-tar… │ │
│ │ │ │ dagster │ │ │
│ │ │ │ definitions. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.components.DefsFolderComponent │ A folder │ [component, │ │
│ │ │ │ which may │ scaffold-tar… │ │
│ │ │ │ contain │ │ │
│ │ │ │ multiple │ │ │
│ │ │ │ submodules, │ │ │
│ │ │ │ each │ │ │
│ │ │ │ which define │ │ │
│ │ │ │ components. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.components.PipesSubprocessScriptCollectionComponent │ Assets that │ [component, │ │
│ │ │ │ wrap Python │ scaffold-tar… │ │
│ │ │ │ scripts │ │ │
│ │ │ │ executed │ │ │
│ │ │ │ with │ │ │
│ │ │ │ Dagster's │ │ │
│ │ │ │ PipesSubpro… │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.multi_asset │ Create a │ [scaffold-ta… │ │
│ │ │ │ combined │ │ │
│ │ │ │ definition │ │ │
│ │ │ │ of multiple │ │ │
│ │ │ │ assets that │ │ │
│ │ │ │ are computed │ │ │
│ │ │ │ using the │ │ │
│ │ │ │ same op and │ │ │
│ │ │ │ same │ │ │
│ │ │ │ upstream │ │ │
│ │ │ │ assets. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.schedule │ Creates a │ [scaffold-ta… │ │
│ │ │ │ schedule │ │ │
│ │ │ │ following │ │ │
│ │ │ │ the provided │ │ │
│ │ │ │ cron │ │ │
│ │ │ │ schedule and │ │ │
│ │ │ │ requests │ │ │
│ │ │ │ runs for the │ │ │
│ │ │ │ provided │ │ │
│ │ │ │ job. │ │ │
│ │ ├─────────────────────────────────────────────────────────────┼──────────────┼───────────────┤ │
│ │ │ dagster.sensor │ Creates a │ [scaffold-ta… │ │
│ │ │ │ sensor where │ │ │
│ │ │ │ the │ │ │
│ │ │ │ decorated │ │ │
│ │ │ │ function is │ │ │
│ │ │ │ used as the │ │ │
│ │ │ │ sensor's │ │ │
│ │ │ │ evaluation │ │ │
│ │ │ │ function. │ │ │
│ │ └─────────────────────────────────────────────────────────────┴──────────────┴───────────────┘ │
│ my_existing_project │ ┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓ │
│ │ ┃ Symbol ┃ Summary ┃ Features ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩ │
│ │ │ my_existing_project.lib.Foo │ COMPONENT SUMMARY HERE. │ [component, scaffold-target] │ │
│ │ └─────────────────────────────┴─────────────────────────┴──────────────────────────────┘ │
└─────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────────┘

You should see my_existing_project.lib.Foo listed in the output. This means our plugin entry point is working.

Create a defs directory

Part of the dg experience is autoloading definitions: dg automatically picks up any definitions that exist in a particular module. We are going to create a new submodule named my_existing_project.defs, from which we will autoload definitions (defs is the conventional name for the module where definitions live in dg).

mkdir my_existing_project/defs
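To make the idea of autoloading concrete, here is a conceptual sketch in plain Python: import every submodule of a package and collect the objects that carry a marker. This is not Dagster's implementation. The marker attribute is_definition, the throwaway demo_defs package, and all function names are hypothetical and exist only for illustration.

```python
import importlib
import pkgutil
import sys
import tempfile
from pathlib import Path


def collect_marked_objects(package_name: str, marker: str = "is_definition") -> list[str]:
    """Import every submodule of a package and collect names of marked objects."""
    package = importlib.import_module(package_name)
    found = []
    for module_info in pkgutil.walk_packages(package.__path__, prefix=f"{package_name}."):
        module = importlib.import_module(module_info.name)
        for name, obj in vars(module).items():
            if getattr(obj, marker, False):
                found.append(f"{module_info.name}.{name}")
    return sorted(found)


def build_demo_package(root: Path) -> str:
    """Write a throwaway package with two modules, each defining one marked function."""
    package_dir = root / "demo_defs"
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")
    for module_name in ("first", "second"):
        (package_dir / f"{module_name}.py").write_text(
            f"def {module_name}_asset(): ...\n{module_name}_asset.is_definition = True\n"
        )
    return "demo_defs"


def run_demo() -> list[str]:
    """Build the demo package in a temporary directory and scan it."""
    with tempfile.TemporaryDirectory() as tmp:
        package_name = build_demo_package(Path(tmp))
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            return collect_marked_objects(package_name)
        finally:
            # Undo the path change and forget the demo modules again.
            sys.path.remove(tmp)
            for loaded in [m for m in sys.modules if m.split(".")[0] == package_name]:
                del sys.modules[loaded]


if __name__ == "__main__":
    print(run_demo())
```

The point of the sketch is that nothing registers the two functions explicitly: dropping a new module into the package is enough for the scan to find it, which is the behavior we want from the defs module.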

Modify top-level definitions

Autoloading is provided by a function that returns a Definitions object. Because we already have some other definitions in our project, we'll combine those with the autoloaded ones from my_existing_project.defs.

To do so, you'll need to modify your definitions.py file, or whichever file contains your top-level Definitions object.

You'll autoload definitions using load_defs, then merge them with your existing definitions using Definitions.merge. You pass load_defs the defs module you just created:

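import dagster as dg
import my_existing_project.defs
from my_existing_project.assets import my_asset

# load_defs is reached via dg.components here; this path is an assumption
# and may differ between dagster versions.
defs = dg.Definitions.merge(
    dg.Definitions(assets=[my_asset]),
    dg.components.load_defs(my_existing_project.defs),
)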

Now let's add an asset to the new defs module. Create my_existing_project/defs/autoloaded_asset.py with the following contents:

import dagster as dg


@dg.asset
def autoloaded_asset(): ...

Finally, let's confirm the new asset is being autoloaded. Run dg list defs again and you should see both the new autoloaded_asset and old my_asset:

dg list defs
┏━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Section ┃ Definitions ┃
┡━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ Assets │ ┏━━━━━━━━━━━━━━━━━━┳━━━━━━━━━┳━━━━━━┳━━━━━━━┳━━━━━━━━━━━━━┓ │
│ │ ┃ Key ┃ Group ┃ Deps ┃ Kinds ┃ Description ┃ │
│ │ ┡━━━━━━━━━━━━━━━━━━╇━━━━━━━━━╇━━━━━━╇━━━━━━━╇━━━━━━━━━━━━━┩ │
│ │ │ autoloaded_asset │ default │ │ │ │ │
│ │ ├──────────────────┼─────────┼──────┼───────┼─────────────┤ │
│ │ │ my_asset │ default │ │ │ │ │
│ │ └──────────────────┴─────────┴──────┴───────┴─────────────┘ │
└─────────┴─────────────────────────────────────────────────────────────┘

Now your project is fully compatible with dg!

Next steps