Flink Forward 2018
<h1 class="semi-opaque">
Flink Forward 2018
<h2>About Flink Forward 2018</h2>
<li>April 9-10, 2018</li>
<li>Organized by data Artisans</li>
<img src="_images/ff_map.png" alt=""/>
969 Market Street<br/>
San Francisco, CA
<section data-background="_images/ff_dayone.jpg">
<section data-background="_images/ff_watermarks0.png">
<img src="_images/ff_watermark1.png" alt=""/>
<img src="_images/ff_watermark2.png" alt=""/>
<img src="_images/ff_watermark3.png" alt=""/>
<img src="_images/ff_watermark4.png" alt=""/>
<img src="_images/ff_watermark5.png" alt=""/>
<section data-background="_images/ff_broadcast0.png"></section>
<img src="_images/ff_broadcast1.png" alt=""/>
<section data-background="_images/ff_state0.png"></section>
<img src="_images/ff_state1.png" alt=""/>
<img src="_images/ff_state2.png" alt=""/>
<img src="_images/ff_state3.png" alt=""/>
<img src="_images/ff_state4.png" alt=""/>
<img src="_images/ff_state5.png" alt=""/>
<img src="_images/ff_state6.png" alt=""/>
Testing Harnesses for Operators
<section data-background="_images/ff_harness0.png"></section>
<img src="_images/ff_harness1.png" alt=""/>
State and Schema Migration
<section data-background="_images/ff_migration0.png"></section>
<img src="_images/ff_migration1.png" alt=""/>
<img src="_images/ff_migration2.png" alt=""/>
<img src="_images/ff_migration3.png" alt=""/>
<img src="_images/ff_migration4.png" alt=""/>
Exactly-Once Processing
<section data-background="_images/ff_exactlyonce0.png"></section>
<img src="_images/ff_exactlyonce1.png" alt=""/>
<img src="_images/ff_exactlyonce2.png" alt=""/>
<img src="_images/ff_exactlyonce3.png" alt=""/>
Deployment and FLIP-6
<section data-background="_images/ff_flip60.png"></section>
<img src="_images/ff_flip61.png" alt=""/>
<img src="_images/ff_flip62.png" alt=""/>
<img src="_images/ff_flip63.png" alt=""/>
<img src="_images/ff_flip64.png" alt=""/>
<img src="_images/ff_flip65.png" alt=""/>
<img src="_images/ff_flip66.png" alt=""/>
<img src="_images/ff_flip67.png" alt=""/>
Running Flink 24-7
<section data-background="_images/ff_capacity0.png"></section>
<img src="_images/ff_capacity1.png" alt=""/>
<img src="_images/ff_capacity2.png" alt=""/>
Metrics, Monitoring and Troubleshooting
<section data-background="_images/ff_monitoring0.png"></section>
<img src="_images/ff_monitoring1.png" alt=""/>
<img src="_images/ff_monitoring2.png" alt=""/>
<img src="_images/ff_monitoring3.png" alt=""/>
<img src="_images/ff_monitoring4.png" alt=""/>
<img src="_images/ff_monitoring5.png" alt=""/>
<img src="_images/ff_monitoring6.png" alt=""/>
<section data-background="_images/ff_daytwo.jpg">
What turns stream processing from a tool into a platform?<br/>
<small>Stephan Ewen, data Artisans</small>
Stream Processing Revolutionizing Big Data<br/>
Srikanth Satya, Dell EMC
Apache Flink + Apache Beam: Expanding the horizons of Big Data<br/>
Anand Iyer, Google Cloud
Powering Yelp’s Data Pipeline Infrastructure with Apache Flink<br/><small>Enrico Canzonieri, Yelp Inc.</small>
<td class="seen">
Scaling stream data pipelines<br/><small>Flavio Junqueira, Pravega by Dell EMC</small>
<aside class="notes">
- Pravega
- Kinda like Kafka
- Auto-partition
- When a partition is created or destroyed, a Flink TaskManager is created or destroyed as well
- Pravega already requires a key
Building a scalable focused web crawler with Flink <br/><small>Ken Krugler, Scale Unlimited</small>
<img src="_images/ff_pravega.png" alt=""/>
<td>Building Flink As a Service platform at Uber<br/><small>Shuyi Chen, Uber</small></td>
<td class="seen">eBay monitoring platform preprocessing and alerting on Flink<br/><small>Garret Li, eBay</small></td>
<aside class="notes">
- Sherlock.IO
- "The Frink", nervous
- Policy changes go through the pipeline
<td>Using Flink for balances and controls across platform boundaries<br/><small>Faraz Babar, American Express</small></td>
<img src="_images/ff_sherlock.jpg" alt=""/>
<img src="_images/ff_sherlock_policy.png" alt=""/>
<tr><td>Scaling Flink in Cloud<br/><small>Steven Wu, Netflix</small></td></tr>
<tr><td>Flink real-time analysis in CloudStream Service of Huawei Cloud<br/><small>Jinkui Shi, Huawei</small></td></tr>
<td class="seen">Why and how to leverage the simplicity and power of SQL on Flink<br/><small>Fabian Hueske, data Artisans</small></td>
<aside class="notes">
- Built into Flink 1.5
- Can create pipelines on the fly
- Launches TaskManagers
<img src="_images/ff_sql.png" alt=""/>
<tr><td>Optimizations in Blink Runtime for Global Shopping Festival at Alibaba<br/><small>Feng Wang, Alibaba</small></td></tr>
<td class="seen">Bootstrapping State In Apache Flink<br/><small>Gregory Fee, Lyft </small></td>
<aside class="notes">
- Lyft
- Grouping all information, including old data
- Needed for information on all rides of all users over all time
- Old data on S3, new data on Kafka
- Multiple inputs
- Old input holds back new input
<tr><td>Panta Rhei: designing distributed applications with streams<br/><small> Aris Koliopoulos, Drivetribe </small></td></tr>
<img src="_images/ff_lyft.png" alt=""/>
<tr><td> Powering TensorFlow with Big Data (Apache Beam &amp; Flink) <br/><small> Holden Karau, Google Cloud </small></td></tr>
<tr><td class="seen">Alibaba’s common algorithm platform on Flink<br/><small> Xu Yang, Alibaba </small></td></tr>
<tr><td> Cloud Native Flink <br/><small> Jowanza Joseph, One Click Retail </small></td></tr>
<tr><td> Embedding Flink Throughout an Operationalized Streaming ML Lifecycle <br/><small> Dave Torok, Comcast Corporation </small></td></tr>
<tr><td>Scaling Uber’s Realtime Optimization with Apache Flink <br/><small> Xingzhong Xu, Uber Technologies Inc. </small></td></tr>
<td class="seen"> Operating Flink on Mesos at Scale <br/><small> Jörg Schad, Mesosphere </small></td>
<aside class="notes">
- Basically does the same thing as Kubernetes
- Choosing Kubernetes or Mesos is a matter of preference
<img src="_images/ff_mesos.png" alt=""/>
<td class="seen"> Finding Bad Acorns <br/><small> Andrew Gao, Capital One </small></td>
<aside class="notes">
- Teller validation
- Python integration (because data scientists use Jupyter)
- Queryable state
- Broke the monolith
<tr><td> dA Platform – Production-ready stream processing with Apache Flink <br/><small> Robert Metzger, data Artisans </small></td></tr>
<tr><td> Real-time monitoring of Mobile Internet Quality of Experience using Flink <br/><small> David Reniz, Everis </small></td></tr>
<img src="_images/ff_acorns_python.png" alt=""/>
<img src="_images/ff_acorns_state.png" alt=""/>
<tr><td> How to build a modern stream processor: The science behind Apache Flink <br/><small> Stefan Richter, data Artisans </small></td></tr>
<td class="seen"> Testing Stateful Streaming Applications <br/><small> Seth Wiesman, MediaMath </small></td>
<aside class="notes">
- Math guy
- Test the pipeline, then break the pipeline to test checkpoints
- WHYYYYY?!?!?!
<tr><td> Extending Flink metrics: Real-time BI atop existing Flink streaming pipelines <br/><small> Andrew Torson, Walmart Labs </small></td></tr>
