Tuesday, August 18, 2020

go-data-routing

 

The go-data-routing library provides a #DSL for stream-oriented processing of data. It is based on the concepts of #EIP (enterprise integration patterns) and the concurrency primitives of #golang.

The motivation for this project is simple: to get an easy and clear way of coding ETL-like programs for parallel processing of data. In my case it was a BFS crawler tuned for extraction of specific metadata (see a basic version in the `example` folder).

Features

The library provides the following primitives:

  • route (chain of nodes processing messages)
  • node:
    • filter
    • processor: processes a stream of tasks in parallel
    • wire tap: sends a copy of a message to another route (referenced by name)
    • to: enriches a message on another route (request-reply / enrichment pattern)

All the primitives are accessible through the DSL.
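
To give a feel for what a filter followed by a parallel processor does, here is a minimal sketch written with plain Go channels and goroutines. It does not use the go-data-routing DSL: the names `filter`, `process` and `runRoute` are made up for illustration, and for brevity the sketch closes channels to signal the end of the stream, which is not necessarily how the library itself works.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// filter forwards only the values accepted by keep (hypothetical helper).
func filter(in <-chan int, keep func(int) bool) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for v := range in {
			if keep(v) {
				out <- v
			}
		}
	}()
	return out
}

// process applies fn to incoming values using a fixed number of workers
// that read from the same input channel in parallel (hypothetical helper).
func process(in <-chan int, workers int, fn func(int) int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	wg.Add(workers)
	for i := 0; i < workers; i++ {
		go func() {
			defer wg.Done()
			for v := range in {
				out <- fn(v)
			}
		}()
	}
	// Close the output once every worker has drained the input.
	go func() {
		wg.Wait()
		close(out)
	}()
	return out
}

// runRoute chains source -> filter -> process and returns sorted results.
// Sorting is needed because parallel workers do not preserve order.
func runRoute(inputs []int) []int {
	src := make(chan int)
	go func() {
		defer close(src)
		for _, v := range inputs {
			src <- v
		}
	}()
	even := filter(src, func(v int) bool { return v%2 == 0 })
	squared := process(even, 3, func(v int) int { return v * v })
	var results []int
	for v := range squared {
		results = append(results, v)
	}
	sort.Ints(results)
	return results
}

func main() {
	fmt.Println(runRoute([]int{1, 2, 3, 4, 5, 6})) // prints [4 16 36]
}
```

The point of a DSL on top of this is to hide the channel and goroutine plumbing, so that a route reads as a chain of named steps.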

Design of node:

  • each node is connected to the next one (if it exists) by exactly 1 channel
  • a node owns its input channel
  • the output is just a reference to the input of the next node
  • a node does not close the output channel; instead it sends a Stop message to the next node
  • if a node is the last in a chain, then its output message is discarded unless it is a RequestReply
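
The design rules above can be sketched in a few lines of plain Go. This is only an illustration under my own assumptions, not the library's actual code: the types `Msg` and `Node`, the `Stop` flag and the functions `NewNode`, `Connect`, `Run` and `runChain` are hypothetical, and the RequestReply case is left out.

```go
package main

import "fmt"

// Msg is a hypothetical message type; Stop marks the end of the stream.
type Msg struct {
	Value int
	Stop  bool
}

// Node owns its input channel; out only references the next node's input.
type Node struct {
	in   chan Msg
	out  chan<- Msg // nil when the node is the last in the chain
	work func(int) int
}

func NewNode(work func(int) int) *Node {
	return &Node{in: make(chan Msg), work: work}
}

// Connect links n to next through a single channel: next's own input.
func (n *Node) Connect(next *Node) { n.out = next.in }

// Run handles messages until a Stop arrives. The Stop message is forwarded
// downstream instead of closing the output channel.
func (n *Node) Run(done chan<- struct{}) {
	for m := range n.in {
		if m.Stop {
			if n.out != nil {
				n.out <- m
			}
			break
		}
		res := Msg{Value: n.work(m.Value)}
		if n.out != nil {
			n.out <- res
		}
		// Last node in the chain: the result is simply discarded.
	}
	done <- struct{}{}
}

// runChain builds a two-node chain and returns what the last node received.
func runChain(inputs []int) []int {
	var seen []int
	first := NewNode(func(v int) int { return v * 2 })
	last := NewNode(func(v int) int { seen = append(seen, v); return v })
	first.Connect(last)

	done := make(chan struct{})
	go first.Run(done)
	go last.Run(done)

	for _, v := range inputs {
		first.in <- Msg{Value: v}
	}
	first.in <- Msg{Stop: true}
	<-done // one node finished
	<-done // the other node finished
	return seen
}

func main() {
	fmt.Println(runChain([]int{1, 2, 3})) // prints [2 4 6]
}
```

Sending a Stop message rather than closing the channel keeps ownership simple: only the node that created a channel is responsible for it, and a channel that several senders write to is never closed under them.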