Hello!
I am curious whether there is a way to use the workdir caching functionality to cache the source code of a Python operation (and execute it in the same thread, without a cluster) without caching the output of that operation. I want to use this to make the code for my outermost execution layer robust to changes in my working code, while the inner operations will have their own tasks configured to cache their outputs and use appropriate resources for each individual task. My outermost execution flow changes almost every time, so caching its output doesn't make much sense and would just create clutter if left enabled.
When I try to implement this with the following configuration (with the "folder" option specified in the WorkDir config but not in the outer "infra" config), I get the error below:
infra = {
    "version": "1",
    "workdir": {
        "folder": cache_path,
        "copied": [filename, "alljoined"],
    },
    "folder": None,  # cfg.paths.exca_cache_path,
    "cluster": None,
}
This produces the following error when executed:
Traceback (most recent call last):
  File "execute_exca.py", line 52, in <module>
    task = CachedTask(cfg=cfg, infra=infra)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<env>/lib/python3.12/site-packages/pydantic/main.py", line 253, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for CachedTask
infra
  Value error, Workdir requires a folder [type=value_error, input_value={'version': '1', 'workdir...: None, 'cluster': None}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/value_error
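One possible workaround, sketched here as an untested assumption rather than a confirmed exca feature: since the validator seems to reject a workdir when the outer "folder" is None, the outer "folder" could point at a throwaway temporary directory, so that any cached output lands somewhere disposable. The values of cache_path and filename below are hypothetical placeholders for the ones in my script, and whether exca accepts this layout would still need to be checked.

```python
import tempfile

# Hypothetical placeholders standing in for the values used in the real script.
cache_path = "/path/to/workdir_cache"
filename = "execute_exca.py"

# Throwaway directory for the outer task's cache; it can be deleted after the run.
scratch_folder = tempfile.mkdtemp(prefix="outer_task_")

infra = {
    "version": "1",
    "workdir": {
        "folder": cache_path,
        "copied": [filename, "alljoined"],
    },
    # Assumption: a non-None folder satisfies the "Workdir requires a folder" check.
    "folder": scratch_folder,
    "cluster": None,
}
```

This only moves the clutter to a location that is easy to clean up; it does not actually disable output caching for the outer task, which is what I would prefer.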