Python SDK Reference

Log parameter

Logs an individual input parameter when it is called. Logged parameters are accessible programmatically or through the GUI as soon as this function is called within your job.

Python

foundations.log_param(key, value)

Arguments

key (str): The name of the input parameter.
value (number, str, bool, array of [number|str|bool], array of array of [number|str|bool]): the value associated with the given input parameter.

Returns

This function doesn't return a value.

Raises

TypeError: When a value of a non-supported type is provided as the parameter value.

Note

Multiple calls with the same key during the same job will overwrite the previously logged value.

Example

import foundations
foundations.log_param("learning rate", 0.001)
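
The type constraint listed under Arguments can be checked before a value ever reaches log_param. The sketch below is a hypothetical helper, not part of the SDK; it assumes "array" means a Python list and that nesting stops at two levels, as the argument description suggests.

```python
from numbers import Number


def is_scalar(value):
    """True for the scalar types listed under Arguments: number, str, bool."""
    # bool is a subclass of int, so Number already covers it.
    return isinstance(value, (Number, str))


def is_loggable(value, max_depth=2):
    """Check a value against the documented types.

    Lists may be nested at most max_depth levels deep (assumed to be 2,
    matching "array of array of [number|str|bool]").
    """
    if is_scalar(value):
        return True
    if isinstance(value, list) and max_depth > 0:
        return all(is_loggable(item, max_depth - 1) for item in value)
    return False


candidate = [[32, 64], [128]]
if is_loggable(candidate):
    # Inside a job you would now call:
    # foundations.log_param("layer sizes", candidate)
    pass
```

A check like this lets a job fail early with a clear message instead of relying on the TypeError raised at logging time.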

Log parameter dictionary

Similar to log_param, but accepts a dictionary of key-value pairs.

Python

foundations.log_params({})

Arguments

dict: Dictionary of parameters to log. Each key-value pair must satisfy the same constraints as those of log_param.

Raises

TypeError: When a value of a non-supported type is provided as a parameter value.

Returns

This function doesn't return a value.

Example

import foundations
foundations.log_params({"learning_rate": 0.001,
                        "batch_size": 32,
                        "epochs": 75})

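Because each value must satisfy the log_param constraints, a nested configuration dictionary cannot be passed as-is. One possible workaround is to flatten it first. The helper name flatten_params and the "." separator below are illustrative choices, not part of the SDK.

```python
def flatten_params(nested, prefix="", sep="."):
    """Flatten nested dictionaries into a single-level dict with joined keys."""
    flat = {}
    for key, value in nested.items():
        name = f"{prefix}{sep}{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into sub-dictionaries, carrying the joined key as prefix.
            flat.update(flatten_params(value, name, sep))
        else:
            flat[name] = value
    return flat


config = {
    "optimizer": {"name": "adam", "learning_rate": 0.001},
    "epochs": 75,
}
flat = flatten_params(config)
# Inside a job you would now call:
# foundations.log_params(flat)
```
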
Log metric

Logs a metric when it is called. Logged metrics are accessible programmatically or through the GUI as soon as this function is called within your job. For example, calling it at the end of every epoch gives you live, updated metrics.

Note

Currently logging numpy types is not supported.
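
Since NumPy types are not accepted, values can be converted to built-in Python types before logging. The sketch below relies on the tolist() method that NumPy scalars and arrays both provide; the helper name to_native is hypothetical and the function is duck-typed so that it does not import NumPy itself.

```python
def to_native(value):
    """Convert NumPy scalars or arrays to built-in Python types.

    Values without a tolist() method are returned unchanged.
    """
    tolist = getattr(value, "tolist", None)
    if callable(tolist):
        return tolist()
    return value


# Inside a job, with NumPy available, usage might look like:
# import numpy as np
# foundations.log_metric("accuracy", to_native(np.float32(0.93)))
# foundations.log_metric("confusion", to_native(np.array([[50, 2], [3, 45]])))
```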

foundations.log_metric(key, value)

Arguments

key (str): the name of the output metric.
value (number, str, bool, array of [number|str|bool], array of array of [number|str|bool]): the value associated with the given output metric.

Returns

This function doesn't return a value.

Raises

TypeError: When a value of a non-supported type is provided as the metric value.

Note

Multiple calls with the same key during the same job will create and append to a list containing the previously logged values.

Example

import foundations
foundations.log_metric("accuracy", 0.90)
foundations.log_metric("accuracy", 0.93)
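
To illustrate the append-on-repeat behaviour in a per-epoch setting, the following sketch passes the logging function into a toy training loop. The train function, the placeholder accuracy update, and the record stand-in are all hypothetical; inside a real job you would pass foundations.log_metric instead.

```python
def train(log_metric, epochs):
    """Toy training loop that reports accuracy after every epoch."""
    accuracy = 0.5
    for _ in range(epochs):
        # Placeholder for a real training step: halve the remaining error.
        accuracy += (1.0 - accuracy) * 0.5
        log_metric("accuracy", accuracy)


# Stand-in logger that mimics the documented behaviour:
# repeated keys accumulate into a list.
history = {}


def record(key, value):
    history.setdefault(key, []).append(value)


train(record, epochs=3)
# Inside a job: train(foundations.log_metric, epochs=3)
```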

Set tag

Sets a tag when it is called. Tags are accessible programmatically or through the GUI as soon as this line runs within your job. Job tags can also be modified within the GUI.

foundations.set_tag(key)

Arguments

key ([number|str]): the name of the tag, displayed on the GUI

Returns

This function doesn't return a value.

Raises

TypeError: When a value of a non-supported type is provided as the tag value.

Example

import foundations
foundations.set_tag("CNN")

Save artifact

Logs an artifact to a job when called. Artifacts can be images, audio clips, text files, or serialized Python objects. The artifact must be saved to disk first.

foundations.save_artifact(filepath, key)

Arguments

filepath ([str]): path of the artifact saved to disk that needs to be logged
key ([number|str]): friendly name associated with the artifact

Returns

This function doesn't return a value.

Notes

Artifacts must be saved to disk before logging.

Example

import foundations
foundations.save_artifact("train_val_loss.png", "Loss_Curve")
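
The "save to disk first" requirement can be made explicit by writing the file and logging it in one step. This is a minimal sketch: write_then_log and the stand-in fake_save_artifact are hypothetical names, and inside a job foundations.save_artifact would be passed in their place.

```python
import os
import tempfile


def write_then_log(save_artifact, text, filepath, key):
    """Write text to filepath, then log the file under the given key."""
    with open(filepath, "w") as handle:
        handle.write(text)
    # The file is closed, and therefore fully on disk, before it is logged.
    save_artifact(filepath, key)


logged = []


def fake_save_artifact(filepath, key):
    # Stand-in: record the key and whether the file existed at call time.
    logged.append((key, os.path.exists(filepath)))


with tempfile.TemporaryDirectory() as workdir:
    report_path = os.path.join(workdir, "report.txt")
    write_then_log(fake_save_artifact, "val_loss=0.12\n", report_path, "Report")

# Inside a job:
# write_then_log(foundations.save_artifact, "val_loss=0.12\n", "report.txt", "Report")
```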

Job submission

Submits a job to the Atlas Scheduler.

Arguments

scheduler_config ([str]): Name of the scheduler. Should always be scheduler for Atlas CE
job_directory ([str]): Default cwd. Optional argument to specify job directory
project_name ([str]): Defaults to current working directory. Optional argument to specify project name. This will take precedence over job.config.yaml
entrypoint ([str]): Optional argument to override the Docker entrypoint of the worker container
command ([list of str]): List of commands to pass to worker. Typically ['main.py', 'arg1', 'arg2']
num_gpus ([int]): Default 0. Used to set whether to run the worker with GPU support. Any positive number other than 0 will mount all available GPU devices inside the worker
params ([dict]): Optional argument. Allows you to specify parameters for a job. This should be a JSON-serializable dictionary whose values are supported by foundations.load_parameters(). When load_parameters() is called within the job, this params argument is returned to that job process. See the load_parameters() docs for loading in parameters.
stream_job_logs ([bool]): Default True. Optional argument to specify if logs should be streamed to the console

Returns

deployment (Object) -- A deployment object which can be used to interact with the job

Notes

The project requirements.txt will not be automatically installed if the worker entrypoint is overridden using submit. See the Custom workers docs for more details.

Example

import foundations
foundations.submit(scheduler_config="scheduler",
                   command=["main.py", "myarg1", "myarg2"],
                   num_gpus=1,
                   stream_job_logs=False)
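
Because params must be a JSON-serializable dictionary, it can help to verify that before submitting. The check below uses only the standard json module; ensure_json_serializable is a hypothetical helper, and the submit call is shown as a comment using only arguments documented above.

```python
import json


def ensure_json_serializable(params):
    """Return params unchanged, or raise ValueError if json.dumps rejects it."""
    try:
        json.dumps(params)
    except (TypeError, ValueError) as error:
        raise ValueError(f"params is not JSON serializable: {error}") from error
    return params


params = ensure_json_serializable({
    "learning_rate": 0.125,
    "layers": [{"neurons": 5}, {"neurons": 6}],
})

# With a scheduler available, the job could then be submitted as:
# foundations.submit(scheduler_config="scheduler",
#                    command=["main.py"],
#                    params=params)
```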

BETA: Deployment Object

The object returned by job_deployment = foundations.submit(...) contains information about the job that it just launched. In its current form, it supports three functions.

# Get back a specific parameter for the job
job_deployment.get_param(param_name: str) -> str

# Get back a specific metric for the job
job_deployment.get_metric(metric_name: str) -> str

# Get back a dictionary that contains the information stored in the job's row on the GUI
job_deployment.get_job_details() -> dict

Note

All of these calls are blocking. If you call one on a job that has not finished, the call waits until the job finishes.

Warning

For hyperparameter search, we normally recommend setting the FOUNDATIONS_COMMAND_LINE environment variable to True to make sure that the search script does not run as a job. However, for the job deployment object to work it needs this environment variable to be either set to False or not set at all.

This means that your search script will show up as a job in the GUI. This "job" will run for as long as the search script takes and will act strangely within the GUI (e.g. no logs will appear).

We are aware of this annoyance and have a fix in the works!
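
A search script can use the blocking get_metric call to compare several submitted jobs. The sketch below is one way to pick the best deployment; best_deployment and the FakeDeployment stand-in are hypothetical, and the code assumes the string returned by get_metric can be parsed with float().

```python
def best_deployment(deployments, metric_name):
    """Return (deployment, value) for the highest value of metric_name.

    Each get_metric call blocks until its job has finished, so this
    function returns only after every job is done.
    """
    scored = [(float(d.get_metric(metric_name)), d) for d in deployments]
    best_value, best = max(scored, key=lambda pair: pair[0])
    return best, best_value


class FakeDeployment:
    """Stand-in for the object returned by foundations.submit(...)."""

    def __init__(self, name, accuracy):
        self.name = name
        self._accuracy = accuracy

    def get_metric(self, metric_name):
        # The documented signature returns a str.
        return str(self._accuracy)


jobs = [FakeDeployment("a", 0.90), FakeDeployment("b", 0.93)]
winner, value = best_deployment(jobs, "accuracy")
```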

Get project metrics

Retrieves metadata, hyper-parameters, metrics, and tags for all jobs associated with a project.

foundations.get_metrics_for_all_jobs(project_name, include_input_params=False)

Arguments

project_name ([str]): Name of the project to filter by
include_input_params ([bool]): Default False. Optional way to specify whether the results should also include all model input parameters.

Returns

metrics (DataFrame) -- A Pandas DataFrame containing all of the results

Raises

ValueError -- An exception indicating that the requested project does not exist

Example

import foundations
foundations.get_metrics_for_all_jobs("my_project")
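
Since a missing project raises ValueError, a caller may want to treat that case as "no results yet". The wrapper below is a hypothetical sketch that takes the lookup function as an argument, so foundations.get_metrics_for_all_jobs can be passed in when the SDK is available.

```python
def metrics_or_none(get_metrics, project_name):
    """Return the project's metrics, or None if the project does not exist."""
    try:
        return get_metrics(project_name, include_input_params=True)
    except ValueError:
        return None


def fake_get_metrics(project_name, include_input_params=False):
    # Stand-in that only knows about one project.
    if project_name != "my_project":
        raise ValueError(f"Project {project_name} does not exist")
    return [{"job_id": "42", "accuracy": 0.93}]


found = metrics_or_none(fake_get_metrics, "my_project")
missing = metrics_or_none(fake_get_metrics, "unknown_project")

# With the SDK:
# metrics = metrics_or_none(foundations.get_metrics_for_all_jobs, "my_project")
```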

Load parameters

Loads job parameters as a dictionary from a file called foundations_job_parameters.json, which must exist in the root of the project. By default, this also logs all loaded parameters in the GUI.

foundations.load_parameters(log_parameters=True)

Arguments

log_parameters (bool): Default True. Optional way to specify whether or not to log all parameter values in the GUI and SDK for the job.

Returns

parameters (dict): A dictionary of all the user-defined parameters for the model, from foundations_job_parameters.json.

Raises

FileNotFoundError: When the foundations_job_parameters.json file is not found in the deployment directory.

Example

Sample foundations_job_parameters.json:

{
    "learning_rate": 0.125,
    "layers": [
        {
            "neurons": 5
        },
        {
            "neurons": 6
        }
    ]
}

import foundations
params = foundations.load_parameters()
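
To show the shape of the returned dictionary without running a job, the sketch below parses the same sample JSON with the standard json module as a stand-in for load_parameters(). The variable names are illustrative.

```python
import json

sample = """
{
    "learning_rate": 0.125,
    "layers": [
        {"neurons": 5},
        {"neurons": 6}
    ]
}
"""

# Stand-in for: params = foundations.load_parameters()
params = json.loads(sample)

learning_rate = params["learning_rate"]
# Nested JSON objects become nested dicts and lists.
layer_sizes = [layer["neurons"] for layer in params["layers"]]
```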

Syncable directories

Foundations offers an interface to sync a directory within a job to a centralized location outside of that job. This directory can then be synced from a different job, allowing you to grab information from past jobs to see what has happened before or to build on the shoulders of giants (with giants being your own previous work).

This feature is useful for advanced model search algorithms, especially when paired with jobs that launch other jobs. Synced directories can be used to quickly implement genetic search algorithms or Bayesian optimization.

foundations.artifacts.create_syncable_directory(key, directory_path=None, source_job_id=None)

Arguments

key (str): What your directory is called in the centralized location.
directory_path (str): Default None. The path to the directory within your job's environment.
source_job_id (str): Default None. The ID of a previous job that has a directory by the same name as the value given to "key". If this is not specified, the current job ID is used.

Returns

syncable_directory (SyncableDirectory):

Examples

The following example shows how you can create and write to a syncable directory from within a job, and then read and write to the same directory from following jobs.

import foundations
import pandas as pd

df = pd.DataFrame([[1, 2, 3]])

directory = foundations.create_syncable_directory("directory_key", "sync/path")
df.to_csv("sync/path/hello.csv")
directory.upload()

If the job gives back the job ID 42, you can use this to read the saved files from any following job.

import foundations
import pandas as pd

directory = foundations.create_syncable_directory("directory_key", "sync/path", "42")
df = pd.read_csv("sync/path/hello.csv")

If you want to write back to the same directory, do so the same way that you did in the first job.

import foundations
import pandas as pd

directory = foundations.create_syncable_directory("directory_key", "sync/path", "42")
df = pd.read_csv("sync/path/hello.csv")

new_df = df + 3
new_df.to_csv("sync/path/hello.csv")
directory.upload()

NOTE: To access the directory that a job uploaded to, in the state that you expect, always use that job's ID. Example: If you have 5 jobs that all read and write to a syncable directory with the same key, always use the previous job's ID.
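
For a chain of jobs sharing one key, the rule in the note amounts to pairing each job with its predecessor's ID. The helper below is a hypothetical sketch of that bookkeeping; it does not call the SDK, and the job IDs are made up.

```python
def source_ids_for_chain(job_ids):
    """Pair each job ID with the source_job_id it should sync from.

    The first job has no predecessor, so its source is None, which
    leaves create_syncable_directory on its default (the current job).
    """
    job_ids = list(job_ids)
    previous = [None] + job_ids[:-1]
    return list(zip(job_ids, previous))


chain = source_ids_for_chain(["41", "42", "43"])
# Each job would then call, for its own (job_id, source) pair:
# foundations.create_syncable_directory("directory_key", "sync/path", source)
```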

Syncing Tensorboard log directory

An extra special form of a syncable directory provides the ability to sync a regular Tensorboard logdir to a centralized storage location. Doing this not only allows you to retrieve files later while tying them to a specific job, but also automatically adds a tag to the job for you. Any job that has this tag can be sent to a Tensorboard server directly from the GUI.

foundations.set_tensorboard_logdir(path)

Arguments

path (str): The path to your Tensorboard logdir within the job's environment.