About

Take home test for a dev role in a European ride-hailing company.

unfare

Calculation of fare estimates based on geographic time series data describing drives.

input structure

The input is a long CSV file describing drives. Each line has the form id_ride, lat, lng, timestamp. The file is well-formed in the sense that drives are not interleaved (no data multiplexing) and there are no time inconsistencies.
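
As an illustration of the line format above, here is a minimal sketch of parsing one line into a struct. The `Point` type, the `parseLine` function, the integer ride id and the Unix-seconds timestamp are all assumptions made for the example; the actual code in unfare.go may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Point is one input tuple. The type and its field names are hypothetical.
type Point struct {
	RideID    int64
	Lat, Lng  float64
	Timestamp int64 // assumed to be Unix seconds
}

// parseLine converts a line of the form "id_ride,lat,lng,timestamp" into a Point.
func parseLine(line string) (Point, error) {
	fields := strings.Split(strings.TrimSpace(line), ",")
	if len(fields) != 4 {
		return Point{}, fmt.Errorf("expected 4 fields, got %d", len(fields))
	}
	for i := range fields {
		fields[i] = strings.TrimSpace(fields[i])
	}
	id, err := strconv.ParseInt(fields[0], 10, 64)
	if err != nil {
		return Point{}, fmt.Errorf("bad id_ride: %w", err)
	}
	lat, err := strconv.ParseFloat(fields[1], 64)
	if err != nil {
		return Point{}, fmt.Errorf("bad lat: %w", err)
	}
	lng, err := strconv.ParseFloat(fields[2], 64)
	if err != nil {
		return Point{}, fmt.Errorf("bad lng: %w", err)
	}
	ts, err := strconv.ParseInt(fields[3], 10, 64)
	if err != nil {
		return Point{}, fmt.Errorf("bad timestamp: %w", err)
	}
	return Point{RideID: id, Lat: lat, Lng: lng, Timestamp: ts}, nil
}

func main() {
	p, err := parseLine("1,37.966660,23.728308,1405594957")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p)
}
```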

challenges

Weed out outlier data. These are tuples that are probably due to faulty equipment and are skewed in relation to the data points that precede them.
The input might be very long, so we need a concurrent solution that scales well with input size.
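
One way to read the first challenge is as a speed check between consecutive points. The sketch below assumes that interpretation: a point is dropped when reaching it from the last accepted point would require an implausible speed. The 100 km/h threshold, the Unix-seconds timestamps and all names are assumptions for illustration, not necessarily what unfare.go does. The distance uses the standard haversine formula, which the repository's haversine.go presumably implements as well.

```go
package main

import (
	"fmt"
	"math"
)

// Point is a hypothetical geo sample; Timestamp is assumed to be Unix seconds.
type Point struct {
	Lat, Lng  float64
	Timestamp int64
}

const (
	earthRadiusKm = 6371.0
	maxSpeedKmh   = 100.0 // assumed threshold, not taken from the project
)

// haversineKm returns the great-circle distance between a and b in kilometres.
func haversineKm(a, b Point) float64 {
	toRad := func(deg float64) float64 { return deg * math.Pi / 180 }
	dLat := toRad(b.Lat - a.Lat)
	dLng := toRad(b.Lng - a.Lng)
	h := math.Sin(dLat/2)*math.Sin(dLat/2) +
		math.Cos(toRad(a.Lat))*math.Cos(toRad(b.Lat))*math.Sin(dLng/2)*math.Sin(dLng/2)
	return 2 * earthRadiusKm * math.Asin(math.Sqrt(h))
}

// filterOutliers keeps the first point unconditionally and drops every later
// point whose implied speed from the last kept point exceeds maxSpeedKmh.
func filterOutliers(points []Point) []Point {
	if len(points) == 0 {
		return nil
	}
	kept := []Point{points[0]}
	for _, p := range points[1:] {
		prev := kept[len(kept)-1]
		hours := float64(p.Timestamp-prev.Timestamp) / 3600
		if hours <= 0 {
			continue // no elapsed time, so no meaningful speed; drop the point
		}
		if haversineKm(prev, p)/hours > maxSpeedKmh {
			continue // implausibly fast: treat as an outlier
		}
		kept = append(kept, p)
	}
	return kept
}

func main() {
	points := []Point{
		{37.0, 23.0, 0},
		{37.0001, 23.0, 10},
		{38.0, 23.0, 20}, // roughly 111 km away after 10 s
		{37.0002, 23.0, 30},
	}
	fmt.Println(len(filterOutliers(points)), "points kept")
}
```

Comparing against the last kept point rather than the last raw point means a single bad sample does not cause the good sample after it to be rejected too.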

architecture

The main goroutine reads through the input serially. As it reads line by line (or tuple by tuple, if you prefer), it fills a temporary bucket representing the current drive id. As soon as it sees the next drive id, it knows the bucket is full and holds all the data for the previous drive id.

We then start a worker goroutine to process that drive concurrently and carry on filling the next bucket. Essentially, a driver goroutine (the main goroutine) assigns work to worker goroutines. Since each worker usually performs a relatively short computation, we expect the number of live workers to remain roughly stable regardless of how long the driver goroutine runs (i.e. how long the input is). In other words, we expect this solution to scale well.

A merger goroutine awaits the results of the worker goroutines and, whenever it gets a new result, appends it to the results file. This routine is also stable in both memory usage and computation time (O(1) to read a result and append it to the file), so we expect it to scale well with input size too.

Waitgroups are used to guard against premature program termination.

How to build

make build

How to run tests

make test (runs the unit tests first, then the end-to-end test)

Prerequisites

A POSIX system to run the e2e tests (they use bash and standard POSIX tooling such as sort)

Known issues

At the moment we do not guard against the first geo point of a drive being an outlier. We ran out of time, but for a first iteration this is probably an acceptably low risk: every ride presumably starts from a stationary position, so the chance of the starting point being faulty is minimal even with dodgy GPS equipment.
