F3arwin [2024]

$$F(\delta) = \underbrace\mathbbI[f_\theta(x+\delta) \neq y] \cdot (1 - \textsoftmax(f_\theta(x+\delta)) y) \textMisclassification confidence - \lambda \cdot \frac\delta\epsilon \sqrtd$$

$$\theta_t+1 = \theta_t - \eta \nabla_\theta \frac1 \sum \delta \in \mathcalP \textadv L(f \theta(x+\delta), y)$$

f3arwin significantly outperforms prior genetic attacks due to adaptive mutation and SBX crossover, which preserves high-fitness perturbation structures. Compared to Square Attack, f3arwin requires 11% fewer queries for a similar ASR. On VGG-16 (unseen during attack generation), f3arwin perturbations crafted on ResNet-50 achieved 68.3% ASR, vs. 51.2% for Square Attack and 59.7% for standard genetic attack. This suggests that evolutionary perturbations capture more model-agnostic features. 5.3 Defensive Robustness | Defense Method | Clean Acc. | Robust Acc. (PGD) | Robust Acc. (f3arwin attack) | |----------------|------------|------------------|-------------------------------| | Standard | 92.1% | 0.3% | 0.1% | | PGD-AT | 88.4% | 51.2% | 43.5% | | TRADES | 87.9% | 53.1% | 46.2% | | f3arwin defense | 89.2% | 54.8% | 58.9% | f3arwin

[4] Madry, A., Makelov, A., Schmidt, L., Tsipras, D., & Vladu, A. (2018). Towards deep learning models resistant to adversarial attacks. ICLR .

(1) f3arwin requires more computational time than PGD-AT for large models (≈3× training slowdown due to population evaluation). (2) The attack may fail on models with extremely non-smooth decision boundaries where crossover becomes destructive. (3) For very high-dimensional inputs (e.g., 224×224×3), the perturbation search space remains challenging without dimensionality reduction. | Robust Acc

[5] Su, J., Vargas, D. V., & Sakurai, K. (2018). One pixel attack for fooling deep neural networks. IEEE Transactions on Evolutionary Computation .

[3] Ilyas, A., Engstrom, L., Athalye, A., & Lin, J. (2019). Black-box adversarial attacks with limited queries and information. ICML . f3arwin

[2] Goodfellow, I. J., Shlens, J., & Szegedy, C. (2015). Explaining and harnessing adversarial examples. ICLR .