CVPR Daily - Saturday

result: GradTree, a new class of DTs that outperforms many baselines on tabular data by combining the inductive bias of trees with the optimization power of neural networks.

Once we could optimize a single tree, we scaled it up to GRANDE, a fully differentiable, weighted ensemble of DTs that remains efficient, robust, and expressive. It achieves state-of-the-art results on tabular benchmarks and extends naturally to broader domains such as reinforcement learning and multimodal learning, where it integrates seamlessly and delivers strong results. For instance, we demonstrated how GRANDE can act as a structured backbone for tabular inputs, combined effectively with CNNs for image processing, or how tree-based heads can serve as the final decision-making layer to boost interpretability.

DTs are not outdated; we've just been using them the same way for over 40 years. This work shows how a structured, interpretable model can benefit from modern optimization tricks, and in doing so, opens the door to new use cases that previously felt out of reach. Sometimes, it's worth going back to where you started and teaching it some new tricks.

Sascha Marton

Figure 1. Greedy vs. Gradient-Based DT. Two DTs trained on the Echocardiogram dataset. The CART DT (a) makes only locally optimal splits, while GradTree (b) jointly optimizes all parameters, leading to significantly better performance.

Figure 2. Standard vs. Dense DT Representation. Comparison of a standard DT (a) and its equivalent dense representation (b) for an exemplary DT with depth 2 and a dataset with 3 variables and 2 classes. Here, h stands for a discretized logistic function.
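To make the idea of a dense, gradient-trainable tree concrete, here is a minimal sketch of a forward pass for a depth-2 soft decision tree and a softmax-weighted ensemble over such trees. This is an illustration, not the authors' implementation: all function and parameter names are hypothetical, a plain logistic sigmoid stands in for GradTree's discretized logistic function, and the soft feature selection per split node is one common way to make the split choice differentiable.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_tree_forward(x, feat_logits, thresholds, leaf_values, scale=10.0):
    """Forward pass of one depth-2 soft decision tree in a dense representation.

    x           : (n_features,) input vector
    feat_logits : (3, n_features) soft feature-selection logits, one row per internal node
    thresholds  : (3,) split thresholds, one per internal node
    leaf_values : (4, n_classes) class scores for each of the 4 leaves
    scale       : steepness of the logistic; larger values approximate hard splits
    """
    # Soft one-hot feature selection per internal node (softmax over features).
    w = np.exp(feat_logits)
    feat_weights = w / w.sum(axis=1, keepdims=True)
    selected = feat_weights @ x                          # (3,) soft-selected value per node
    p_right = sigmoid(scale * (selected - thresholds))   # probability of branching right
    p_left = 1.0 - p_right
    # Path probabilities of the 4 leaves (node 0 = root, nodes 1/2 = its children).
    leaf_probs = np.array([
        p_left[0] * p_left[1],    # left-left
        p_left[0] * p_right[1],   # left-right
        p_right[0] * p_left[2],   # right-left
        p_right[0] * p_right[2],  # right-right
    ])
    # Expected class scores under the soft routing; every term is differentiable.
    return leaf_probs @ leaf_values                      # (n_classes,)

def ensemble_forward(x, trees, weight_logits):
    """Softmax-weighted sum of tree outputs (an assumed stand-in for GRANDE's weighting)."""
    w = np.exp(weight_logits)
    w = w / w.sum()
    outputs = np.stack([soft_tree_forward(x, *params) for params in trees])
    return w @ outputs
```

Because the routing probabilities and ensemble weights are smooth functions of the parameters, gradients flow to every split, leaf, and weight at once, which is exactly what lets all parameters be optimized jointly rather than greedily, split by split, as in CART.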

RkJQdWJsaXNoZXIy NTc3NzU=