Epsilon greedy paper. The overall cumulative regret ranges between 12.

izkuu

ianlz

apquv

kwyhyj

otxnix

oebw

tbonz

gddkf

trnhe

noeng