SQL Connection Traps. Must Know for Data Engineer/Data Scientist
If there is ‘data’ in your job title, then you must be aware about what is described in this article.
If there is ‘data’ in your job title, then you must be aware about what is described in this article.

Photo by michael podger on Unsplash
The awareness of traps described in this article may prevent you from having a false result while querying a database.
By the end of this article you will:
know what are the connection traps in SQL
be able to resolve the problem by restructuring a weakly built SQL model
be aware of the existence of connection traps so that you could avoid having a ‘false positive’ query result
We will discuss 2 kinds of connection traps that appear in some databases, especially when database is built to serve one particular application with some well-defined functionality. Even though such database model will work well for what you need now, one day the same database may be used (queried) in a different way. Probably, this will be the day you realize that some part of connection information is lost due to incorrect modeling.
Taking all into account, I tried my best to make this article 100% practical.
Business Case
You are asked to build a relational model for a car market chain.
There are few shops belonging to the chain and each shop sells many different cars. For the purpose of this article’s problematic we leave 3 main entities: car shop, employee and car.
Assume that a description that is given from the chain owner is as follows:
Each car shop has N employees and N cars. Each car will be sold by a particular employee because all cars are assigned to employees (so that employee’s performance is easy to track).
Also, there is a type of employee who works only with luxury cars and, due to the fact that there are not enough luxury cars per each shop, those employees have to supervise the cars from few shops.

A relational representation would be:
1:N relation between employee and car
N:N relation between the shop and employee because one shop may have multiple employees and some employees will work in few shops

Entity relation diagram
Chasm Trap
Chasm trap — the pathway from one entity to another seems existing but it may be broken in some cases.
Chasm Trap Problem
You have to count how many cars are there per each shop. To do so, you have to write a query that will count the number of cars assigned to each employee of each shop. All seems correct, however, the result differs from what you should get. How did it happen?
Cause of a Chasm Trap
Some new cars just arrived, and they have not been assigned yet, for example. According to our model, those cars are not connected to any shop while it is not true in reality.

‘Car 7’ belongs to ‘Shop 1’ but we cannot deduct this information until car is unassigned to an employee
In example above we observe ‘Shop 1’ that has 8 cars and 5 employees. Due to a fact that ‘Car 7’ is still not assigned to any employee, we do not see it in stock. Basically, it lives in our car table and without an outsider’s knowledge (shop manager, for example) it cannot be assigned to a shop. Therefore, database itself is not reliable anymore since it is not capable to assign a car to a shop.
How You Could Avoid Having a Chasm Trap
First, you should have thought about such case while collecting the requirement.
Second, what does one to many mean? Is 1 to 0 still one to many? Some people will say yes and they will be right. For example, 1:0 is a subset of 1:N.
No specified lower bound is a problem.
My advice is to always choose between one of those two options:
One to (zero or many)
One to (one or many)
In such case, there will not be any space left for ambiguity.

Try to avoid using top 2 types of connection explainers
When you check your entity relation graph and it seems to have the path from one entity to another through some other entities, be sure that there is no xyz to (zero or many) OR xyz to (zero or one) paths.
Fan Trap
Fan trap — the pathway from one entity to another is ambiguous.
Fan Trap Problem
Fan traps are trickier comparing to chasm traps because something that is ambiguous can be sometimes missed or misinterpreted while chasm traps will always set an alarm.
Following our example, the fan trap will appear when a luxury car selling employee will be part of the equation.
Cause of a Fan Trap
This type of employee (one particular employee) may have 1:N relation to car shop and 1:N relation to cars. Consequently, when you try to connect car shop with cars, you do not have a way to identify correctly which car belongs to which shop.

Shops and cars are blades of a fan with employee being a center of a fan
Employee entity is like a messy roll of many unicolor strings that you will never be able to separate.
Now, even if all the connections are xyz to (one or many) you will not be able to correctly identify the cars in a specific car shop because the pathway is ambiguous.
Solution for Fan Trap
Shop → Car → Employee
Such modeling will resolve the fan trap, however, employees without cars will still be lost in chasm trap. Especially for the case of the luxury car sellers that will be a case.
Solution for Both Traps
(Shop → Employee → Car) + (Shop → Car)

Add one additional connection between shop and a car
When we add one additional connection that jumps over the employee entity we eliminate both, fan and chasm, traps.
It may seem like a redundant connection, but it is not a redundancy — it is a right way to structure the model.
The Worst Case Scenario Example
Imagine that you still have both traps simultaneously. You perform a simple check by making a query that counts the number of cars in stock in one shop and you get a correct result. However, one car has not been assigned yet (-1) and one employee who manages luxury cars had luxury car (+1) from other shop. Those two effects just neutralized each other. You validate your check and roll the model into production. Consequences may be catastrophic for business.
Unless you know about connection traps you will never raise an alarm and ask yourself the right questions.
Recap & Conclusion
Connection traps appear when a pathway between two or more entities is either ambiguous or false existent.
Relational databases seem like something straightforward, however, correctly written query may not always lead to a correct result.
Try to always specify the lower bound of connection.
Be aware of the SQL connection traps that may occur during modeling of a database or while querying a third party database, otherwise, you may make a small mistake that will lead to a huge problem in the future.
I hope one day this information will help you.