IPSC 2009 Solution to Problem L – Let there be rainbows!

IPSC 2009

Solution to Problem L – Let there be rainbows!

We will first show a solution to a special case that corresponded to the easy input: the case where the tree is in fact a simple path.

We can number the edges 1 to N - 1 in order in which they lie on the path. Each of the edges can have one of eight possible colors.

Let’s start by identifying the types of queries we need to process. We can break the query described in the problem statement into two parts:

given x, y and c, count the edges between x and y inclusive that have color c
given x, y and c, color all edges between x and y inclusive by color c

We will use an interval tree in order to answer each of the queries in O(log N).

In each vertex of the interval tree, we will store the following information:

a boolean stating whether the entire interval represented by this vertex has the same color
if yes, an integer id of that color
for each color, the number of edges in our interval that have this color (based on the information stored in our subtree only).

To process a query of the first type, we take the given interval [x,y], recursively descend down the tree and sum up the stored information. To process a query of the second type, we again recursively descend down the tree and update the information. One important issue that needs to be handled correctly are the vertices that were fully colored before we started processing the query. The easiest way how to handle these: whenever we encounter such a vertex during our descent, we mark it as not full, and instead mark both its children as fully painted using the parent’s color.

Before we present the general solution, one simple trick. Imagine that you run depth-first search on a rooted tree. For each vertex v, one can store two times: the time t_d(v) when it was discovered, and the time t_f(v) when we finished processing it.

From the nature of depth-first search it is obvious that for any two vertices v, w the intervals [t_d(v),t_f(v)] and [t_d(w),t_f(w)] are either disjoint or nested one inside another. And more precisely, the second case happens if and only if the vertex with the longer interval is an ancestor of the vertex with the shorter one.

This observation can be used in the following way: In O(N) we can precompute all values t_d(⋅) and t_f(⋅), and then we can use them to answer queries of the type “is x an ancestor of y?” in constant time.

We will now show a solution to the general problem. Our solution preprocesses a tree with N vertices in O(N) time, and then answers each query in O(log ²N) time. Hence the total time complexity of our solution will be O(N + Qlog ²N).

The first step in our solution is to compute the heavy-light decomposition of the given tree. We root the given tree at an arbitrary vertex, and for each vertex v we compute the size of its subtree s(v) (i.e., the number of vertices it contains). Now, let v be an arbitary vertex. If v has a child c such that s(c) ≥ s(v)∕2, then the edge v - c is called heavy, otherwise the edge is called light.

Note that each vertex can only have at most one child connected by a heavy edge. Hence the heavy edges in a tree form a set of vertex-disjoint paths.

Now we can take the entire tree and decompose it into at most N - 1 edge-disjoint paths as follows: Let S be the set of vertices that don’t have any child connected by a heavy edge. For each of these vertices we construct one path as follows: we start traveling towards the root, stop after we use the first light edge, and take all edges we used to form a single path. We’ll call these paths important paths and number them 1 to P.

What does this decomposition accomplish?

Imagine that you are traveling down the tree, starting at the root. How many times can you use a light edge? Clearly, lg N is an upper bound, because each time you use a light edge the number of vertices in your subtree is at least halved.

We just showed that for any vertex v the path from v to the root only contains at most lg N light edges. Now imagine that as we walk along the path from v to the root, we are keeping track of the important path we are on. This only changes when we use a light edge. And as there are at most lg N light edges on our path, we only change the important path at most lg N times.

Almost the same is true for a path between two arbitrary vertices v and w: Let x be their least common ancestor in the rooted tree. The path from v to w can be split into two paths, v to x and x to w. Each of these two paths only uses edges from at most lg N important paths, hence the entire path from v to w crosses at most 2lg N important paths.

For each important path, we will keep an interval tree that will be used in the same way as in the solution to the easy problem.

Hence to answer a query, we first identify the segments of important paths it uses. This can be done in O(log N). Then we process each segment separately. Processing each segment involves two queries on the interval tree for its important path, hence the total time complexity is O(log ²N) per query.

Internet Problem Solving Contest

IPSC 2009

Solution to Problem L – Let there be rainbows!