Reading Note: "Real-Time Machine Learning: The Missing Pieces"

Context:

ML has predominantly focused on training static models and serving predictions from them

  • Supervised learning paradigm
  • Static models are trained on offline data

There is a strong shift toward tight integration of ML models in feedback loops.

  • Broader paradigm (RL)
  • Applications may operate in real environments
  • Fuse and react to sensory data from numerous input streams
  • Perform continuous micro-simulations
  • Close the loop by taking actions that affect the sensed environment (see the loop sketch below)
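
A minimal sketch of such a closed loop, with hypothetical sense/update_belief/simulate/act functions standing in for the real components (all names and numbers here are illustrative, not from the paper):

    import random

    # Hypothetical stand-ins for the real components of a feedback-loop ML app.
    def sense():
        """Read the latest observation from the sensor input streams."""
        return random.random()

    def update_belief(belief, observation):
        """Fuse the new observation into the current estimate of the world."""
        return 0.9 * belief + 0.1 * observation

    def simulate(belief, action):
        """One cheap micro-simulation scoring a candidate action."""
        return -abs(belief - action)

    def act(action):
        """Apply the action, which in turn affects what is sensed next."""
        print(f"acting with {action:.2f}")

    belief = 0.0
    for _ in range(5):                   # the closed loop: sense -> simulate -> act
        belief = update_belief(belief, sense())
        best = max([0.0, 0.5, 1.0], key=lambda a: simulate(belief, a))
        act(best)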

Learning by interacting with the real world can be:

  • Unsafe
  • Impractical
  • Bandwidth-limited

Many RL systems rely heavily on simulating physical or virtual environments

  • Simulations may be used during training
    • E.g., learn a neural network policy
  • Simulations may be used during deployment
    • Constantly update the simulated environment while interacting with the real world
    • Perform many simulations to figure out the next action
      • E.g., use online planning algorithms like Monte Carlo tree search
    • Requires performing simulations faster than real time (see the planning sketch after this list)
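
A toy version of online planning, using plain Monte Carlo rollouts rather than full MCTS; the simulator, reward, and action set are illustrative assumptions, not the paper's:

    import random

    def step(state, action):
        """Toy simulator: the agent moves by `action`; the target drifts."""
        agent, target = state
        return (agent + action, target + random.choice([-1, 0, 1]))

    def reward(state):
        agent, target = state
        return -abs(agent - target)      # closer to the target is better

    def plan(state, actions=(-1, 0, 1), rollouts=100, horizon=5):
        """Score each action by many short simulated futures; pick the best.
        For real-time use, these rollouts must finish faster than real time."""
        def score(action):
            total = 0.0
            for _ in range(rollouts):
                s = step(state, action)
                for _ in range(horizon - 1):
                    s = step(s, random.choice(actions))
                total += reward(s)
            return total / rollouts
        return max(actions, key=score)

    state = (0, 3)
    for _ in range(5):                   # keep re-planning as the world changes
        a = plan(state)
        state = step(state, a)           # in deployment this is the real world
        print(f"action={a:+d} state={state}")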

Nature of emerging ML applications:

  • Real-time
  • Reactive
  • Interactive

Emerging ML applications require new levels of programming flexibility and performance

  • Performance requirements
    • Low latency: fine-granularity task execution with millisecond end-to-end latency
    • High throughput: task execution on the order of millions of tasks per second [16, 19]
  • Execution model requirements [10]
    • Dynamic task creation
      • RL primitives such as Monte Carlo tree search may generate new tasks during execution, based on the results or durations of other tasks (see the task sketch after this list)
    • Heterogeneous tasks
    • Arbitrary dataflow dependencies
  • Practical requirements
    • Transparent fault tolerance
    • Debuggability and profiling
      • Often the most time-consuming aspects of writing any distributed application
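
A single-process sketch of dynamic task creation, using Python's standard concurrent.futures; the expand task is hypothetical, and real systems in this space would run such tasks across a cluster:

    from concurrent.futures import ThreadPoolExecutor, as_completed

    def expand(node):
        """Hypothetical unit of work, e.g. evaluating one node of a search
        tree; returns child nodes that deserve further exploration."""
        return [2 * node, 2 * node + 1] if node < 8 else []

    with ThreadPoolExecutor(max_workers=4) as pool:
        pending = {pool.submit(expand, 1)}
        while pending:
            done = next(as_completed(pending))       # wait for any task
            pending.remove(done)
            for child in done.result():
                # New tasks are created at runtime, based on the results of
                # earlier tasks; the dataflow graph grows as it executes.
                pending.add(pool.submit(expand, child))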

Static dataflow systems:

  • Are well-established in analytics and ML
  • But require the dataflow graph to be specified upfront, e.g., by a driver program
  • MapReduce/Spark: emphasize the Bulk Synchronous Parallel (BSP) model
  • Dryad/Naiad: support complex dependency structures
  • TensorFlow/MXNet: optimized for deep-learning workloads
  • None of them supports the ability to dynamically extend the dataflow graph in response to both input data and task progress (see the contrast sketch below)
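
To make the limitation concrete, here is an illustrative sketch (not any system's real API) of a computation whose dataflow graph can only be determined while it runs:

    import random

    # A static dataflow system needs the whole graph before execution starts,
    # roughly: read -> map -> reduce, fixed by the driver. Below, the number
    # of simulation tasks is decided by their own results, so the graph can
    # only be built during execution. (simulate is a hypothetical task.)

    def simulate(state):
        """Hypothetical simulation task returning a score in [0, 1]."""
        return random.random()

    def plan(state, threshold=0.95, max_tasks=1000):
        results = []
        while len(results) < max_tasks:
            results.append(simulate(state))   # graph extended at runtime...
            if max(results) > threshold:      # ...in response to task results
                break
        return max(results)

    print(plan(state="s0"))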

References

[10] Duan et al. Benchmarking deep reinforcement learning for continuous control. ICML 2016.

[16] Nair et al. Massively parallel methods for deep reinforcement learning. 2015.

[19] Silver et al. Mastering the game of Go with deep neural networks and tree search. Nature 529, 7587 (2016).
