Direct Parameterization of Lipschitz-Bounded Deep Networks

Summary: This paper (arXiv) introduces a new parameterization of deep neural networks (both fully-connected and convolutional) with guaranteed Lipschitz bounds, i.e. limited sensitivity to perturbations. The Lipschitz guarantees are equivalent to the tightest known bounds based on certification via a semidefinite program (SDP), which does not scale to large models. In contrast to the SDP approach, we provide a direct parameterization, i.e. a smooth mapping from $\mathbb R^N$ onto the set of weights of Lipschitz-bounded networks. This enables training via standard gradient methods, without any computationally intensive projections or barrier terms. The new parameterization can equivalently be thought of as either a new layer type (the sandwich layer) or a novel parameterization of standard feedforward networks with parameter sharing between neighbouring layers.
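
To make the sandwich layer concrete, below is a minimal PyTorch sketch of its fully-connected form, following the equations in the paper: free parameters are mapped through a Cayley transform to matrices with $A A^\top + B B^\top = I$, and $\Psi = \mathrm{diag}(e^d)$ is a free diagonal scaling. This is an illustrative reimplementation under those assumptions, not the code shipped in this repository.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def cayley(W):
    # Map a free matrix W = [U; V] (U square, n x n) to Q = [A_hat; B_hat]
    # with orthonormal columns: A_hat^T A_hat + B_hat^T B_hat = I.
    n = W.shape[1]
    U, V = W[:n, :], W[n:, :]
    Z = U - U.T + V.T @ V
    I = torch.eye(n, dtype=W.dtype, device=W.device)
    inv = torch.linalg.inv(I + Z)
    return torch.cat([inv @ (I - Z), -2.0 * V @ inv], dim=0)

class SandwichFc(nn.Module):
    # One fully-connected sandwich layer,
    #   h -> sqrt(2) * A^T Psi relu( sqrt(2) * Psi^{-1} B h + b ),
    # with A A^T + B B^T = I enforced by the Cayley transform and
    # Psi = diag(exp(d)). Each such layer is 1-Lipschitz by construction.
    def __init__(self, in_features, out_features):
        super().__init__()
        self.W = nn.Parameter(0.1 * torch.randn(out_features + in_features, out_features))
        self.d = nn.Parameter(torch.zeros(out_features))  # log-diagonal of Psi
        self.b = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        n_out = self.d.shape[0]
        Q = cayley(self.W)
        A_hat, B_hat = Q[:n_out], Q[n_out:]  # A = A_hat^T, B = B_hat^T
        z = math.sqrt(2.0) * (x @ B_hat) * torch.exp(-self.d) + self.b
        return math.sqrt(2.0) * (F.relu(z) * torch.exp(self.d)) @ A_hat.T
```

Since each such layer is 1-Lipschitz, any composition of them is 1-Lipschitz as well; the prescribed bound $\gamma$ can then be obtained by a fixed scaling (e.g. of the input), which is roughly how the --gamma flag in the commands below enters.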

A BibTeX entry for LaTeX users:

@inproceedings{ruigang2023direct,
    title={Direct Parameterization of Lipschitz-Bounded Deep Networks}, 
    author={Ruigang Wang and Ian R. Manchester},
    booktitle={International Conference on Machine Learning},
    year={2023},
    organization={PMLR}
}

Experiments

  1. Install the required packages: einops, auto-attack (https://github.com/fra31/auto-attack)
  2. Reproduce the experiments (or evaluate the pretrained models) by running the commands below.

Lipschitz tightness

  • Toy example (Tab. 1 & Fig. 3): m=[train, eval], g=[1,5,10] (an illustrative empirical check of these bounds is sketched after the commands)
python main.py --mode [m] --model Toy --gamma [g] --layer Sandwich --scale small --dataset square_wave --epochs 200 
python main.py --mode [m] --model Toy --gamma [g] --layer Orthogon --scale small --dataset square_wave --epochs 200 
python main.py --mode [m] --model Toy --gamma [g] --layer Aol --scale small --dataset square_wave --epochs 200
python main.py --mode train --model Toy --gamma [g] --layer SLL --scale small --dataset square_wave --epochs 200
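
For the tightness comparison, eval mode computes the empirical Lipschitz estimates reported in Tab. 1. As an illustration of the idea (not the repository's evaluation code), one can compute a lower bound on the Lipschitz constant of the scalar-output toy model by gradient-ascending the input-gradient norm; the function below and its settings are hypothetical:

```python
import torch

def lipschitz_lower_bound(model, x0, n_steps=200, lr=0.01):
    # Empirical *lower* bound on the Lipschitz constant of a scalar-output
    # model: gradient-ascend the input-gradient norm ||df/dx|| starting
    # from a batch of inputs x0. Any value found can be compared against
    # the guaranteed upper bound gamma.
    for p in model.parameters():
        p.requires_grad_(False)
    x = x0.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    best = 0.0
    for _ in range(n_steps):
        opt.zero_grad()
        g, = torch.autograd.grad(model(x).sum(), x, create_graph=True)
        norm = g.flatten(1).norm(dim=1).max()  # worst case over the batch
        best = max(best, norm.item())
        (-norm).backward()                     # ascend on the gradient norm
        opt.step()
    return best
```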

Empirical robustness

  1. Fig. 4:
  • CIFAR-10 (top row): m=[train, eval], g=[1,10,100]
python main.py --mode [m] --model KWL --layer Sandwich --scale small --gamma [g] --dataset cifar10 --loss multimargin --epochs 100
python main.py --mode [m] --model KWL --layer Orthogon --scale small --gamma [g] --dataset cifar10 --loss multimargin --epochs 100
python main.py --mode [m] --model KWL --layer Aol --scale small --gamma [g] --dataset cifar10 --loss multimargin --epochs 100
python main.py --mode [m] --model KWL --layer Plain --scale small --dataset cifar10 --loss multimargin --epochs 100
python main.py --mode [m] --model Resnet --layer SLL --scale small --gamma [g] --dataset cifar10 --loss multimargin --epochs 100
  • CIFAR-100 (bottom row): m=[train, eval], g=[1,2,4]
python main.py --mode [m] --model KWL --layer Sandwich --scale large --gamma [g] --dataset cifar100 --loss xent --epochs 100
python main.py --mode [m] --model KWL --layer Orthogon --scale large --gamma [g] --dataset cifar100 --loss xent --epochs 100
python main.py --mode [m] --model KWL --layer Aol --scale large --gamma [g] --dataset cifar100 --loss xent --epochs 100
python main.py --mode [m] --model KWL --layer Plain --scale large --dataset cifar100 --loss xent --epochs 100
python main.py --mode [m] --model Resnet --layer SLL --scale large --gamma [g] --dataset cifar100 --loss xent --epochs 100
  2. Fig. 5:
  • CIFAR-10 (left): s=[123,43,13,7,365]
python main.py --mode train --model KWL --layer Sandwich --scale small --gamma 100 --dataset cifar10 --loss multimargin --epochs 100 --seed [s]
python main.py --mode train --model KWL --layer Orthogon --scale small --gamma 100 --dataset cifar10 --loss multimargin --epochs 100 --seed [s]
python main.py --mode train --model KWL --layer Aol --scale small --gamma 100 --dataset cifar10 --loss multimargin --epochs 100 --seed [s]
python main.py --mode train --model KWL --layer Plain --scale small --dataset cifar10 --loss multimargin --epochs 100 --seed [s]
python main.py --mode train --model Resnet --layer SLL --scale small --gamma 100 --dataset cifar10 --loss multimargin --epochs 100 --seed [s]
  • CIFAR-100 (right): s=[123,43,13,7,365]
python main.py --mode train --model KWL --layer Sandwich --scale large --gamma 10 --dataset cifar100 --loss xent --epochs 100 --seed [s]
python main.py --mode train --model KWL --layer Orthogon --scale large --gamma 10 --dataset cifar100 --loss xent --epochs 100 --seed [s]
python main.py --mode train --model KWL --layer Aol --scale large --gamma 10 --dataset cifar100 --loss xent --epochs 100 --seed [s]
python main.py --mode train --model KWL --layer Plain --scale large --dataset cifar100 --loss xent --epochs 100 --seed [s]
python main.py --mode train --model Resnet --layer SLL --scale large --gamma 10 --dataset cifar100 --loss xent --epochs 100 --seed [s]
  3. Tab. 2:
  • CIFAR-100: m=[train, eval], c=[small,medium,large], s=[123,43,13]
python main.py --mode [m] --model KWL --layer Sandwich --scale [c] --gamma 2 --dataset cifar100 --loss xent --epochs 100 --normalized --seed [s]
python main.py --mode [m] --model KWL --layer Orthogon --scale [c] --gamma 2 --dataset cifar100 --loss xent --epochs 100 --normalized --seed [s]
python main.py --mode [m] --model KWL --layer Aol --scale [c] --gamma 2 --dataset cifar100 --loss xent --epochs 100 --normalized --seed [s]
python main.py --mode [m] --model Resnet --layer SLL --scale [c] --gamma 2 --dataset cifar100 --loss xent --epochs 100 --normalized --seed [s]
  • Tiny-Imagenet: m=[train, eval], c=[small,medium,large], s=[123,43,13]
python main.py --mode [m] --model KWL --layer Sandwich --scale [c] --gamma 2 --dataset tiny_imagenet --loss xent --epochs 100 --normalized --seed [s]
python main.py --mode [m] --model KWL --layer Orthogon --scale [c] --gamma 2 --dataset tiny_imagenet --loss xent --epochs 100 --normalized --seed [s]
python main.py --mode [m] --model KWL --layer Aol --scale [c] --gamma 2 --dataset tiny_imagenet --loss xent --epochs 100 --normalized --seed [s]
python main.py --mode [m] --model Resnet --layer SLL --scale [c] --gamma 2 --dataset tiny_imagenet --loss xent --epochs 100 --normalized --seed [s]
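
Eval mode reports empirical robust accuracy under AutoAttack (installed in step 1 above). For reference, a minimal l2 evaluation with the auto-attack package looks roughly like the sketch below; the function name and the budget eps are illustrative rather than the exact settings behind the figures:

```python
import torch
from autoattack import AutoAttack

def empirical_robust_accuracy(model, x, y, eps=36 / 255, batch_size=128):
    # Robust accuracy of `model` on a batch (x, y) under AutoAttack in the
    # l2 threat model; eps is the perturbation budget (36/255 is a common
    # l2 choice, shown here only as an example).
    model.eval()
    adversary = AutoAttack(model, norm='L2', eps=eps, version='standard')
    x_adv = adversary.run_standard_evaluation(x, y, bs=batch_size)
    with torch.no_grad():
        pred = model(x_adv).argmax(dim=1)
    return (pred == y).float().mean().item()
```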

Certified robustness

  1. Tab. 3 & 4 (the Lipschitz-margin certificate behind --cert_acc is sketched after these commands):
  • CIFAR-100: m=[train, eval], s=[123,43,13]
python main.py --mode [m] --model DNN --layer Sandwich --scale small --gamma 1 --dataset cifar100 --loss xent --epochs 400 --cert_acc --seed [s]
  • Tiny-Imagenet: m=[train, eval], s=[123,43,13]
python main.py --mode [m] --model DNN --layer Sandwich --scale small --gamma 1 --dataset tiny_imagenet --loss xent --epochs 400 --cert_acc --seed [s]
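
The --cert_acc flag reports certified robust accuracy. The certificate is the standard Lipschitz-margin bound: for an l2 Lipschitz bound gamma, the predicted class at x cannot change within radius eps whenever the logit margin exceeds sqrt(2) * gamma * eps. A minimal sketch of that check (illustrative, not the repository's code):

```python
import torch

def certified_accuracy(logits, labels, gamma, eps):
    # Fraction of test points that are both correct and certified at
    # l2 radius eps: any logit difference f_i - f_j of a gamma-Lipschitz
    # network is sqrt(2)*gamma-Lipschitz, so a margin above
    # sqrt(2)*gamma*eps guarantees the argmax cannot change in the ball.
    top2 = logits.topk(2, dim=1).values
    margin = top2[:, 0] - top2[:, 1]
    correct = logits.argmax(dim=1) == labels
    certified = margin > (2.0 ** 0.5) * gamma * eps
    return (correct & certified).float().mean().item()
```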
