Unbreak.info - DARTS Demo

Adaptive Module Parameters

Policy: Choose an explore/exploit policy.

Bayes UCB Scale: Only valid for the Bayes UCB policy.

Epsilon: Only valid for the Epsilon-Greedy policy.

Explore/Exploit Policy

An explore/exploit policy governs how the Adaptive Module trades off exploring target pools it knows little about against exploiting the target pool it currently believes is best.

UCB1: relies on an upper confidence bound to quantify how uncertain the Adaptive Module is about a target pool.
Bayes UCB: similar to UCB1, but computes the upper confidence bound differently so that it gains confidence in target pools more quickly.
Epsilon-Greedy: balances exploration and exploitation with a simple heuristic: it explores sub-optimal target pools a fraction of the time equal to epsilon.
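
As a rough illustration of the upper-confidence-bound idea, the sketch below computes the textbook UCB1 score for each target pool: the mean reward plus a bonus that shrinks as the pool is tried more often. The function name `ucb1_scores` and its inputs are hypothetical, and the demo does not state whether it uses this exact bonus term.

```python
import math

def ucb1_scores(mean_rewards, pull_counts):
    """Return one textbook UCB1 score per target pool (illustrative only)."""
    total_pulls = sum(pull_counts)
    scores = []
    for mean, count in zip(mean_rewards, pull_counts):
        if count == 0:
            # An untried pool is maximally uncertain, so it is tried first.
            scores.append(math.inf)
        else:
            # The bonus grows with total experience and shrinks with
            # experience of this particular pool.
            bonus = math.sqrt(2.0 * math.log(total_pulls) / count)
            scores.append(mean + bonus)
    return scores
```

Two pools with the same mean reward get different scores here: the one tried less often receives the larger bonus, which is what drives exploration.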

Bayes UCB Scale

Applies only to the Bayes UCB policy. Sets the width of the upper confidence bound as a number of standard deviations of the rewards earned by each target pool.
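
Read literally, the description above suggests a score of the form mean plus scale times standard deviation. The sketch below follows that reading; `bayes_ucb_score` is a hypothetical name, and using the population standard deviation of the observed rewards is an assumption, since the demo does not show its actual formula.

```python
import statistics

def bayes_ucb_score(rewards, scale):
    """Mean reward plus `scale` standard deviations (assumed form)."""
    if not rewards:
        raise ValueError("need at least one observed reward")
    mean = statistics.fmean(rewards)
    # Population standard deviation; it is 0.0 for a single observation.
    spread = statistics.pstdev(rewards)
    return mean + scale * spread
```

With a scale of 0 the score collapses to the plain mean reward, so larger scales give more weight to pools whose rewards vary widely.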

Epsilon

Applies only to the Epsilon-Greedy policy. Higher values explore more often, whereas lower values lean toward exploiting the best-performing target pool.
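
A minimal sketch of the Epsilon-Greedy heuristic, assuming each target pool is summarized by its mean reward so far. The name `epsilon_greedy_choice` and the `rng` parameter are hypothetical and exist only for illustration.

```python
import random

def epsilon_greedy_choice(mean_rewards, epsilon, rng=random):
    """Return the index of the target pool to use for this round."""
    indices = range(len(mean_rewards))
    best = max(indices, key=mean_rewards.__getitem__)
    # With probability epsilon, explore one of the sub-optimal pools.
    if len(mean_rewards) > 1 and rng.random() < epsilon:
        others = [i for i in indices if i != best]
        return rng.choice(others)
    # Otherwise exploit the pool with the best mean reward.
    return best
```

An epsilon of 0 always exploits the best pool, and an epsilon of 1 always explores one of the others.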

Allocation Strategy Parameters

Greed Factor: Choose a greed factor. Increase it to exploit the best target pool more often.

Picking Strategy: Choose a picking strategy.

Picking Order: Choose a picking order.

Greed Factor

The greed factor modifies the scores output by whichever explore/exploit policy is selected. The higher the factor, the greedier the policy becomes.
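
The demo does not say how the greed factor is applied to the scores. One plausible mechanism, shown purely as an assumption, is to raise each policy score to the power of the greed factor before turning the scores into allocation shares. The name `allocation_shares` is hypothetical, and the sketch assumes positive, finite scores.

```python
def allocation_shares(scores, greed_factor):
    """Turn policy scores into shares that sum to 1 (assumed mechanism)."""
    # Exponentiation widens the gap between high and low scores
    # as greed_factor grows; a factor of 0 gives equal shares.
    weights = [score ** greed_factor for score in scores]
    total = sum(weights)
    return [weight / total for weight in weights]
```

Under this assumption, scores of 1 and 2 give the better pool two thirds of the allocation at a greed factor of 1 and eight ninths at a greed factor of 3.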

Picking Strategy

Dictates how targets are picked from the target pools.

Round-Robin: picks one target at a time from each target pool, rotating through the pools until allocations have been exhausted. This is the fairest method.
Greedy: picks all of the targets at one time from the target pool with the best performance. This favors the top-performing target pool.
Altruist: picks all of the targets at one time from the target pool with the worst performance. This favors the worst-performing target pool.
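
The three strategies above can be sketched as different orderings of the same picks. This sketch assumes each target pool has already been allotted a number of picks and has a single performance score; the name `pick_sequence`, its inputs, and the idea that Greedy and Altruist then continue through the remaining pools in ranked order are assumptions for illustration.

```python
def pick_sequence(allocations, performance, strategy):
    """Return the pool names in the order their picks are made.

    allocations: dict mapping pool name to its number of allotted picks
    performance: dict mapping pool name to its performance score
    """
    if strategy == "round-robin":
        remaining = dict(allocations)
        order = []
        # Rotate through the pools, one pick each, until none has picks left.
        while any(remaining.values()):
            for pool in remaining:
                if remaining[pool] > 0:
                    order.append(pool)
                    remaining[pool] -= 1
        return order
    if strategy in ("greedy", "altruist"):
        # Greedy ranks best-first, altruist ranks worst-first.
        ranked = sorted(allocations, key=performance.__getitem__,
                        reverse=(strategy == "greedy"))
        # Each pool makes all of its picks before the next pool starts.
        return [pool for pool in ranked for _ in range(allocations[pool])]
    raise ValueError(f"unknown picking strategy: {strategy!r}")
```

The ordering matters only if earlier picks can affect later ones, for example when pools share targets, which is an assumption about the demo rather than something it states.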

Picking Order

Determines which targets should be prioritized when picking from a target pool.

Best: if the target pool assigns scores to targets, picks the highest scoring targets first. If it does not, picks the first target in the list of targets.
Worst: if the target pool assigns scores to targets, picks the lowest scoring targets first. If it does not, picks the last target in the list of targets.
Random: picks a target from the target pool at random.
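
A minimal sketch of the three picking orders, assuming a target pool is an ordered list of targets with an optional mapping from target to score. The name `pick_target` and its parameters are hypothetical.

```python
import random

def pick_target(targets, order, scores=None, rng=random):
    """Pick one target from a pool according to the picking order."""
    if order == "random":
        return rng.choice(targets)
    if order not in ("best", "worst"):
        raise ValueError(f"unknown picking order: {order!r}")
    if scores is None:
        # Without scores, fall back to list position:
        # first target for "best", last target for "worst".
        return targets[0] if order == "best" else targets[-1]
    chooser = max if order == "best" else min
    return chooser(targets, key=scores.__getitem__)
```

Calling the function repeatedly and removing each picked target from the list would yield the pool's targets in the chosen order.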