Answer to: Recommended data structures/algorithms for checking peoples' availability schedules
Score: 4
I've been overthinking this problem. I think there's a simple answer that is easy to implement and should perform more than adequately for your needs.
This is easily solved using Postgres. The high-level logical structure is this:
I have my doubts about whether it's worthwhile but if you want to normalize this structure, you can add another table:
And then what you want is the difference of two sets:
users with availability that contains the event
users with assignments that conflict with the event
That is, you want the list of users for 1 without any of the users from 2.
Postgres has an EXCEPT clause for this purpose. It works similarly to a UNION clause in that you use it between two compatible select statements. But instead of combining the results, the results from the second query are removed. With the simple design above (the first diagram) the query would look like this:
SELECT a.id
FROM availability a
WHERE a.start <= :event_start AND :event_start < a.end
AND a.start < :event_end AND :event_end <= a.end
EXCEPT
SELECT a.id
FROM assignment a
WHERE (:event_start <= a.start_date AND a.start_date < :event_end)
OR (:event_start <= a.end_date AND a.end_date < :event_end)
OR (a.start <= :event_start AND :event_start < a.end
AND a.start < :event_end AND :event_end <= a.end)
The above is more clearly written as:
SELECT av.id
FROM availability av
WHERE (:event_start BETWEEN av.start AND av.end)
AND (:event_end BETWEEN av.start AND av.end)
EXCEPT
SELECT as.id
FROM assignment as
WHERE (as.start_date BETWEEN :event_start AND :event_end)
OR (as.start_end BETWEEN :event_start AND :event_end)
OR ((:event_start BETWEEN as.start AND as.end)
AND (:event_end BETWEEN as.start AND as.end))
Note: The BETWEEN operator treats the endpoint as inclusive. The first SQL query above (without between) treats the end as exclusive. I would generally prefer exclusive endpoints but this fact might tip the balance in favor of inclusive end dates.
For performance, you should add range searchable indexes to these tables. Some index types will only work with exact matches and will not be used in this kind of query. If you have the appropriate indexes on this, it should execute efficiently. You can also use a NOT IN condition instead of an EXCEPT clause if you prefer. I expect the general performance of the two should be the same. Consult a DBA or the DBA stack for more detailed performance guidance for Postgres.
Technically, that last OR condition only needs to check for either the start or the end of the new event being found with an assigned period. I'm leaving both for clarity.
This answer uses scalar types for simplicity. You should consider the range types in Basilev's answer. They look to be especially useful for constraints around this kind of problem. And if you do go with a single ranged value for the periods, I don't think there's any reason use a fourth table as in the second relationship diagram above.
View Question ↗
Question
Parent Entity
Score: 5 • Views: 595
Site: softwareengineering
Other Comments / Reviews
SaaS Metrics