вторник, 18 августа 2020 г.

go-data-routing

 

go-data-routing

The go-data-routing library provides a  #DSL for stream-oriented processing of data. Based on the concepts of #EIP (enterprise integration patterns) and concurrency primitives of #golang.

The motivation for this project is simple: to get an easy and clear way of coding ETL-like programs for parallel processing of data. In my case it was a BFS crawler tuned for extraction of specific metadata, (see a basic version in `example` folder).  

Features

The library provides the following primitives:

  • route (chain of nodes processing messages)
  • node:
    • filter
    • processor -- processes a stream of tasks in parallel
    • wire tap: sends a copy of msg to another route (referenced by name)
    • to: enrich msg on another route (request-reply / enrichment pattern)

All the primitives are accessible through DSL.

Design of node:

  • each node is connected with the next one (if exists) only with 1 channel
  • node owns an input channel
  • output is just a reference to the input of next node
  • node does not close the output channel, instead it just sends a Stop msg to a next node
  • if a node is the last in a chain than an output message being sent is discarded unless it's not a RequestReply

суббота, 29 февраля 2020 г.

go-pool-of-workers

This is a simplistic Go implementation of the pool of workers [1].
When you might need it:
If you want to handle a number/stream of jobs in parallel but with a limited number of goroutines.
All you need is:
  • a job Runner -- a handler for a specific unit of work
  • an optional callback to handle a result of job
Code example:
type Job struct{ result SomeType }

func (r *Job) Run() {
    time.Sleep(200 * time.Millisecond)
    r.result = x
}

func main() {
    p := pool.NewPool(2, 4) // minWorkers, maxWorkers
    p.fnOnResult = func(handledJob Job) {}

    tasks := 10
    for tasksCnt > 0; tasksCnt-- {
        p.Submit(&Job{})
    }
    p.Stop()
}
Design
The key difference from canonical pool [2] is an absence of a common job queue. Instead, there is a common queue of idle workers [1][2].
Features / properties
  • The producer (submitting a job) is unlocked as soon as a idle worker consumes the job, thus potentially reducing the time producer is blocked.
  • Number of workers increases on demand from minWorkers to maxWorkers
  • You may have as many types of jobs as you like
References
  1. Go: Worker Pool vs Pool of Workers
  2. Go by Example: Worker Pools
  3. Handling 1 Million Requests per Minute with Go