tf.train.MonitoredTrainingSession(master='', is_chief=True, checkpoint_dir=None, scaffold=None, hooks=None, chief_only_hooks=None, save_checkpoint_secs=600, save_summaries_steps=100, config=None)
See the guide: Training > Distributed execution
Creates a MonitoredSession for training.
For a chief, this utility sets the proper session initializer/restorer. It also creates hooks related to checkpoint and summary saving. For workers, this utility sets a proper session creator that waits for the chief to initialize/restore.
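As a rough illustration of the chief case, the sketch below trains a toy one-variable model in a single process and lets the default checkpoint hook write to checkpoint_dir. The function name train_toy_model, the toy loss, and the use of tf.train.StopAtStepHook to end the loop are assumptions made for the example, not part of this page. It also assumes a TensorFlow install that provides tensorflow.compat.v1 (1.14 or later, including 2.x); on an older 1.x release, plain `import tensorflow as tf` without the eager-execution call should be the equivalent.

```python
import tensorflow.compat.v1 as tf  # assumption: TF >= 1.14 or TF 2.x

tf.disable_eager_execution()  # graph mode, as in TF 1.x


def train_toy_model(checkpoint_dir, num_steps=20):
    """Hypothetical helper: minimizes (w - 3)^2 as the chief and returns the number of run() calls."""
    runs = 0
    with tf.Graph().as_default():
        global_step = tf.train.get_or_create_global_step()
        w = tf.get_variable("w", shape=[], initializer=tf.zeros_initializer())
        loss = tf.square(w - 3.0)
        train_op = tf.train.GradientDescentOptimizer(0.1).minimize(
            loss, global_step=global_step)

        # Stop once the global step reaches num_steps.
        hooks = [tf.train.StopAtStepHook(last_step=num_steps)]

        # is_chief=True: this process initializes (or restores) the variables
        # and gets the default checkpoint saver for checkpoint_dir.
        with tf.train.MonitoredTrainingSession(
                is_chief=True,
                checkpoint_dir=checkpoint_dir,
                hooks=hooks) as sess:
            while not sess.should_stop():
                sess.run(train_op)
                runs += 1
    return runs
```

Calling the helper with a fresh temporary directory, for example `train_toy_model(tempfile.mkdtemp(), num_steps=5)`, should leave at least one checkpoint in that directory, which `tf.train.latest_checkpoint` can locate.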
Args:
master: String. The TensorFlow master to use.
is_chief: If True, it will take care of initialization and recovery of the underlying TensorFlow session. If False, it will wait on a chief to initialize or recover the TensorFlow session.
checkpoint_dir: A string. Optional path to a directory from which to restore variables.
scaffold: A Scaffold used for gathering or building supportive ops. If not specified, a default one is created. It's used to finalize the graph.
hooks: Optional list of SessionRunHook objects.
chief_only_hooks: List of SessionRunHook objects. These hooks are activated if is_chief==True and ignored otherwise.
save_checkpoint_secs: The frequency, in seconds, at which a checkpoint is saved using a default checkpoint saver. If save_checkpoint_secs is set to None, then the default checkpoint saver isn't used.
save_summaries_steps: The frequency, in number of global steps, at which the summaries are written to disk using a default summary saver. If save_summaries_steps is set to None, then the default summary saver isn't used.
config: An instance of tf.ConfigProto used to configure the session. It is passed as the config argument to the tf.Session constructor.

Returns:
A MonitoredSession object.
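The arguments above can be combined in several ways. One possible arrangement, sketched below, disables both default savers by passing None and supplies a step-based tf.train.CheckpointSaverHook through chief_only_hooks instead. The helper name run_with_custom_hooks, the step counts, and the allow_soft_placement setting are illustrative assumptions. As before, the sketch assumes tensorflow.compat.v1 is available, and it runs only the chief, since a non-chief session would wait for a chief that does not exist in a single process.

```python
import tensorflow.compat.v1 as tf  # assumption: TF >= 1.14 or TF 2.x

tf.disable_eager_execution()  # graph mode, as in TF 1.x


def run_with_custom_hooks(checkpoint_dir, last_step=10, save_every_n_steps=5):
    """Hypothetical helper: increments the global step and returns the latest checkpoint path."""
    with tf.Graph().as_default():
        global_step = tf.train.get_or_create_global_step()
        increment = tf.assign_add(global_step, 1)

        # Share one Scaffold so the session and the saver hook use the same saver.
        scaffold = tf.train.Scaffold()

        # hooks run on every worker; chief_only_hooks run only when is_chief is True.
        hooks = [tf.train.StopAtStepHook(last_step=last_step)]
        chief_only_hooks = [
            tf.train.CheckpointSaverHook(
                checkpoint_dir,
                save_steps=save_every_n_steps,
                scaffold=scaffold),
        ]

        # Passed through as the config argument of the underlying session.
        config = tf.ConfigProto(allow_soft_placement=True)

        with tf.train.MonitoredTrainingSession(
                master='',
                is_chief=True,
                checkpoint_dir=checkpoint_dir,
                scaffold=scaffold,
                hooks=hooks,
                chief_only_hooks=chief_only_hooks,
                save_checkpoint_secs=None,   # default time-based saver off
                save_summaries_steps=None,   # default summary saver off
                config=config) as sess:
            while not sess.should_stop():
                sess.run(increment)
    return tf.train.latest_checkpoint(checkpoint_dir)
```

With the default savers turned off, any checkpoint found afterwards comes from the hook passed in chief_only_hooks, which makes the effect of that argument easy to observe.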
Defined in tensorflow/python/training/monitored_session.py.
© 2017 The TensorFlow Authors. All rights reserved.
Licensed under the Creative Commons Attribution License 3.0.
Code samples licensed under the Apache 2.0 License.
https://www.tensorflow.org/api_docs/python/tf/train/MonitoredTrainingSession