This is a frequent question in interviews, but recently I came upon this in my general work - so shall retell it almost "as is" instead of inventing something artificial.
My current company (Gelato) helps people print everything on anything with use of various partner print-houses in various places over the world - and then deliver production where requested by various partner delivery services.
One of the teams is dedicated to tracking - which is very important for customer. Several servers collect information from these various delivery companies and show it to users in a unified way.
There are three important tables in this tracking system:
package
- something, which needs to be delivered,attempt
of delivery, sometimes several attempts for the same packageevent
- generally there are several delivery events per attempt,
like "created", "picked up", "in transit" etc.As said, there are many delivery services and sometimes updates from them are delayed or won't come at all. To investigate such situation, we want a query which gives tracking codes of all packages, which have no any events, except "created" (which really is set by us, rather than comes from shipment provider).
To make problem closer to "industrial reality", we do not elaborate further. Please, study the sample data! "Study" - means not only visually, but perhaps with few test queries :)