Experiments¶
The top-level object in every configuration file must be a Runnable, the most common and useful being the Experiment class, which facilitates executing an ML pipeline.
The Experiment’s most important parameter is the pipeline, where users can define a DAG of Components describing how datasets, models, training procedures, etc. interact with each other.
Attention
For a full specification of Experiment, see the Experiment package reference.
The implementation of Experiment and its pipeline uses Ray Tune under the hood.
Pipeline¶
A pipeline is defined as a list of Components that will be executed sequentially. Each Component is identified by a key that can be used for later linking.
Let’s assume that we want to define an experiment that consists of:
1. Pick dataset A and preprocess it.
2. Train model A on dataset A.
3. Preprocess dataset B.
4. Fine-tune the model trained in step 2 on dataset B.
5. Evaluate the fine-tuned model on dataset A’s test set.
All these stages can be represented by a sequential pipeline in a simple and readable way:
pipeline:
  dataset_A: !SomeDataset
    ...
  model_A: !SomeModel
    ...
  trainer: !Trainer
    model: !@ model_A
    dataset: !@ dataset_A
    ...
  dataset_B: !SomeDataset
    ...
  fine_tuning: !Trainer
    model: !@ trainer.model
    dataset: !@ dataset_B
    ...
  eval: !Evaluator
    model: !@ fine_tuning.model
    dataset: !@ dataset_A
Note how this represents a DAG where the nodes are the Components and the edges are the links to attributes of previously defined Components.
Linking¶
As seen before in Quickstart, stages in the pipeline are connected using Links. Links can be used anywhere in the pipeline to refer to earlier components or any of their attributes. During the compilation described above in Delayed Initialization, the links are resolved to their intended values, but the original link representation is cached so that the config can later be dumped back to YAML with the original links.
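For example, a link written with the !@ tag can point either to a whole earlier component or to one of its attributes (the component and attribute names below are hypothetical):
pipeline:
  dataset: !SomeDataset
    ...
  trainer: !Trainer
    dataset: !@ dataset        # link to the whole component
    ...
  evaluator: !Evaluator
    model: !@ trainer.model    # link to an attribute of an earlier component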
Search Options¶
Experiment supports declaring multiple variants in the pipeline by making use of the search tags:
!Experiment
...
pipeline:
  ...
  model: !TextClassifier
    ...
    n_layers: !g [2, 3, 4]
    ...
The value !g [2, 3, 4] indicates that each of the values should be tried; Flambé will internally create 3 variants of the model. You can specify grid search options for any parameter in your config without changing your code to accept a new type of input (in this case, n_layers still receives an int).
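To make this concrete, here is a minimal sketch of a hypothetical component (the class body and defaults below are assumptions, not Flambé’s actual TextClassifier): n_layers is declared as a plain int, and each grid-search variant receives one concrete value.
import flambe

class TextClassifier(flambe.nn.Module):

    def __init__(self, n_layers: int = 2, hidden_size: int = 128) -> None:
        super().__init__()
        # Each variant receives a single concrete int here; the !g tag
        # is handled entirely by Flambé's configuration layer.
        self.n_layers = n_layers
        self.hidden_size = hidden_size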
Tip
You can also search over Components or even links:
!Experiment
...
pipeline:
  dataset: !SomeDataset
    transform:
      text: !g
        - !SomeTextField {{}} # Double braces needed here
        - !SomeOtherTextField {{}}
Types of search options:
!g
Previously shown; grids over all of its values.
param: !g [1, 2, 3] # grids over 1, 2 and 3
param: !g [0.001, 0.01] # grids over 0.001 and 0.01
!s
Yields k values from a range (low, high). If both low and high are int values, then !s will yield int values; otherwise, it will yield float values.
param: !s [1, 10, 5] # yields 5 int values from 1 to 10
param: !s [1.5, 2.2, 5] # yields 5 float values from 1.5 to 2.2
param: !s [1.5, 2.2, 5, 2] # yields 5 float values from 1.5 to 2.2, rounded to 2 decimals
Combining Search tags
Searching over different attributes at the same time has a combinatorial effect.
For example:
!Experiment
...
pipeline:
  ...
  model: !TextClassifier
    ...
    n_layers: !g [2, 3, 4]
    hidden_size: !g [128, 256]
This will produce 6 variants (3 n_layers values times 2 hidden_size values).
Variants inheritance
Attention
Any object that links to an attribute of an object that describes multiple variants will inherit those variants.
!Experiment
...
pipeline:
  ...
  model: !TextClassifier
    n_layers: !g [2, 3, 4]
    hidden_size: !g [128, 256]
    ...
  trainer: !Trainer
    model: !@ model
    lr: !g [0.01, 0.001]
    ...
  evaluator: !Evaluator
    model: !@ trainer.model
The trainer will have 12 variants (6 from model times 2 from lr). evaluator will run for all 12 variants, as it links to trainer.
Reducing¶
Experiment provides a reduce mechanism so that variants don’t flow down the pipeline. reduce is declared at the Experiment level, and it can specify the number of variants to reduce to for each Component:
!Experiment
...
pipeline:
  ...
  model: !TextClassifier
    n_layers: !g [2, 3, 4]
    hidden_size: !g [128, 256]
  trainer: !Trainer
    model: !@ model
    lr: !g [0.01, 0.001]
  evaluator: !Evaluator
    ...
    model: !@ trainer.model

reduce:
  trainer: 2
Flambé will then pick the best 2 variants of trainer once it finishes executing. This means evaluator will receive only the best 2 variants.
Resources (Additional Files and Folders)¶
The resources argument lets users specify files that can be used in the Experiment (usually local datasets, embeddings or other files). In this section, you can put your resources under local or remote.
Local resources: the local section must include all local files.
Remote resources: the remote section must contain all files that are already located on the instances and therefore must not be uploaded. This feature is only useful when running remotely (read Clusters).
!Experiment
...
resources:
  local:
    data: path/to/train.csv
    embeddings: path/to/embeddings.txt
  remote:
    remote_embeddings: /file/in/instance/
...
Attention
The remote section is only useful in remote experiments. If you are running local experiments, then only the local section should be used.
resources can be referenced in the pipeline via linking:
!Experiment
...
resources:
  ...
  local:
    embeddings: path/to/embeddings.txt
pipeline:
  ...
  some_field: !@ embeddings
Scheduling and Reducing Strategies¶
When running a search over hyperparameters, you may want to use a more sophisticated scheduler. Through Tune, you can already use algorithms such as HyperBand, and soon more complex search algorithms like HyperOpt will be available.
schedulers:
  b1: !tune.HyperBandScheduler

pipeline:
  b0: !ext.TCProcessor
    dataset: !ext.SSTDataset
  b1: !Trainer
    train_sampler: !BatchSampler
      data: !@ b0.train
      batch_size: !g [32, 64, 128]
    model: ...
  b2: !Evaluator
    model: !@ b1.model
General Logging¶
We adopted the standard library’s logging module for logging:
import logging

logger = logging.getLogger(__name__)
...
logger.info("Some info here")
...
logger.error("Something went wrong here...")
The best part of the logging paradigm is that you can instantly start logging in any file in your code without passing any data or arguments through your object hierarchy.
Important
By default, only log statements at or above the INFO log level will be shown in the console. The rest of the logs will be saved in ~/.flambe/logs (more on this in Debugging).
In order to show all logs in the console, you can use the --verbose flag when running flambé:
flambe my_config_file.yaml --verbose
Tensorboard Logging¶
Flambé provides full integration with Tensorboard. Users can easily have data routed to Tensorboard through the logging interface:
from flambe import log
...
loss = ...  # some calculation here
log('train loss', loss, step)
Where the first parameter is the tag which Tensorboard uses to name the value.
The logging system will automatically detect the type and make sure it goes to the right Tensorboard function.
See flambe.logging.log() in the package reference.
Flambé also provides logging for special types of data:
- flambe.logging.log_image() for images
- flambe.logging.log_histogram() for distributions and histograms
- flambe.logging.log_pr_curves() for displaying PR curves
- flambe.logging.log_text() for displaying text
See the logging package reference for more information on how to use these logging methods.
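As a rough sketch of how these might be called, the snippet below assumes a (tag, data, step) call pattern mirroring flambe.logging.log(); the exact parameters are assumptions, so check the flambe.logging package reference for the real signatures:
from flambe.logging import log_histogram, log_text

...
predicted_text = ...  # some text produced during evaluation
layer_weights = ...   # some tensor of parameters
# Assumed call pattern (tag, data, step); see the package reference.
log_text('sample prediction', predicted_text, step)
log_histogram('weight distribution', layer_weights, step)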
Script Usage¶
If you’re using the flambe.learn.Script object to wrap an existing piece of code with a command-line based interface, all of the logging information above still applies to you!
See more on Scripts in Converting a script to Flambé.
Checkpointing and Saving¶
As Quickstart explains, Flambé saves an Experiment in a hierarchical way so that Components can be accessed independently of each other.
Specifically, our save files are a directory by default, and
include information about the class name, version, source code, and YAML config,
in addition to the state that PyTorch normally saves, and any custom state
that the implementer of the class may have included.
For example, if you initialize and use the following object as a part of your Experiment:
!TextClassifier
embedder: !Embedder
  embedding: !torch.Embedding
    input_size: !@ b0.text.vocab_size
    embedding_size: 300
  encoder: !PooledRNNEncoder
    input_size: 300
    rnn_type: lstm
    n_layers: 2
    hidden_size: 256
output_layer: !SoftmaxLayer
  input_size: !@ b1.model.encoder.encoder.rnn.hidden_size
  output_size: !@ b0.label.vocab_size
Then the save directory would look like the following:
save_path
├── state.pt
├── config.yaml
├── version.txt
├── source.py
├── embedder
│ ├── state.pt
│ ├── config.yaml
│ ├── version.txt
│ ├── source.py
│ ├── embedding
│ │ ├── state.pt
│ │ ├── config.yaml
│ │ ├── version.txt
│ │ └── source.py
│ └── encoder
│ ├── state.pt
│ ├── config.yaml
│ ├── version.txt
│ └── source.py
└── output_layer
├── state.pt
├── config.yaml
├── version.txt
└── source.py
Note that each subdirectory is self-contained: if it’s possible to load that object on its own, you can load from just that subdirectory.
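For instance, loading only the embedder from the save directory above might look like the following sketch (assuming flambe.load accepts a path to any self-contained save directory):
import flambe

# Load the full model from the top-level save directory...
model = flambe.load('save_path')

# ...or load just the embedder from its self-contained subdirectory.
embedder = flambe.load('save_path/embedder')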
Important
As seen before, each variant of a Component will have its own separate output folder.
Resuming¶
Experiment has a way of resuming previously run experiments:
!Experiment
resume: trainer
...
pipeline:
  ...
  model: !TextClassifier
    ...
    n_layers: !g [2, 3, 4]
    hidden_size: !g [128, 256]
  trainer: !Trainer
    ...
    model: !@ model
    lr: !g [0.01, 0.001]
  other_trainer: !Trainer
    ...
    model: !@ trainer.model
By providing a Component keyname (or a list of them) that belongs to the pipeline, Flambé will resume all blocks up until the given one (or ones).
Debugging¶
Experiment has a debugging option that is only available in local executions (not remote ones). It is activated by adding debug: True at the top level of the YAML. When debugging is on, a debugger will appear before run is executed on each Component.
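For example (the pipeline contents are elided):
!Experiment
debug: True
...
pipeline:
  ...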
Warning
Debugging is not enabled when running remote experiments.
Adding Custom State¶
Users can add other data to the state that is saved in the save directory. If you just want to have some additional instance attributes added, you can register them at the end of the __init__ method:
import flambe

class MyModel(flambe.nn.Module):

    def __init__(self, x, ...):
        super().__init__(...)
        ...
        self.x = x
        self.y = None
        self.register_attrs('x', 'y')
This will cause the get_state method to start including x and y in the state dict for instances of MyModel, and when you load state into instances of MyModel it will know to update these attributes.
If you want more flexibility to manipulate the state_dict or add computed properties, you can override the _state() and _load_state() methods.
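A minimal sketch of what such an override might look like is shown below; the method signatures here are assumptions modeled on PyTorch-style state hooks, so check the Component package reference for the actual ones:
import flambe

class MyModel(flambe.nn.Module):

    def _state(self, state_dict, prefix, local_metadata):
        # Add a computed property to the outgoing state.
        state_dict[prefix + 'n_params'] = sum(p.numel() for p in self.parameters())
        return state_dict

    def _load_state(self, state_dict, prefix, local_metadata):
        # Recover the custom entry when state is loaded back in.
        self.n_params = state_dict.pop(prefix + 'n_params', None)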