Monday, October 28, 2019

Flink Window

Flink Window

Windows split the stream into "bucket" of finite size.

  • Keyed / no Keyed
    The first thing to be specified.
    • keyed windows
      • base on key, compute parallel by multiple tasked
      • .keyBy(...)
        .window(...)
    • No-keyed Windows
      • Single task (parallelism :1)
      • .windowAll(...)
  • Window Lifecycle
    • Created when the first event arrived.
    •  Removed (time based) when reach allowed lateness.
    • Trigger
      • Trigger policy examples
        • when windows have more than number of elements.
        • When the watermark passes the end of window.
      • Trigger interface
        • onElement()
        • onEventTime()
        • onProcessingTime()
        • onMerge()
        • clear()
      • TriggerResult
        • CONTIME
        • FIRE
        • PURGE
        • FIRE_AND_PURGE
      • Built-in and Custom Triggers 
        • EventTimeTrigger
        • ProcessTimeTrigger
        • CountTrigger
        • PurgingTrigger
        • Custom trigger : see the abstract Trigger class.
    • Function
      • ProcessWindowFunction() 
        • Has to buffer all elements internally before invoke
        • Use Context object with access to time and state information
      • ReduceFunction()
        • Efficient
      • AggregateFunction()
        • Efficient
        • A general version of Reduction. Take three types
          • An input type (IN)
          • Accumulator Type (ACC)
          • Output Type (OUT)
      • FoldFunction()
        • The first element is combined with a pre-defined initial value of the output type.
    • Evictors
      • Optional
      • evictor(...)
      • Interface
        • evictBefore()
        • evictAfter()
      • Pre-implemented evictors
        • CountEvictor
        • DeltaEvictor
        • TimeEvictor
  •  Assigner
    • Tumbling Windows 
      • Time-based
      • Fixed length of timestamp and no overlap.
      • .window(TumblingEventTimeWindows.of(Time.seconds(5)))
        
    • Sliding Windows
      • Time-based 
      • Can be overlappin based on the sliding parameter.
      • Optional of offset -- to be used in the different time zone.
      • .window(SlidingEventTimeWindows.of(Time.seconds(10), Time.seconds(5)))
        
    • Session Windows
      • No overlap.
      • No fixed start/end time.
      • Defined by session, session gap extractor function to define how long of inactivity.
      • When a session expired, create a new session for the subsequent elements.
      • Allow to merge
      •  .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
         
    •  Global Windows
      • Can not aggregated since no ending.
      • Only useful for custom trigger.
      • .window(GlobalWindows.create())
        
    • Customized Windows : by extending WindowAssigner class.

    No comments: