Neural networks (NNs) have gained significant attention in the physics community because of their ability to find non-trivial patterns in large datasets. Developing a theory of NN learning has nevertheless proven challenging because of the vast number of degrees of freedom in a typical NN. Fortunately, statistical field theory already provides tools for analyzing similar many-body problems. Infinite-width NNs correspond to free field theories, while finite width gives rise to interactions. Signal propagation through a network can be viewed as a renormalization group flow in which the marginal couplings are the network's hyperparameters, tuned to criticality to prevent exponential growth or decay of signals. We study the effect of initializing a network with weights sampled from a distribution over orthogonal matrices and find several key features which indicate that networks with orthogonal initialization might perform better than those with Gaussian initialization throughout training.
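
To make the contrast between the two initialization schemes concrete, the following is a minimal sketch in Python with NumPy, not the setup analyzed in this work. It assumes a deep linear network of square layers with no biases or nonlinearities, which is a simplification chosen only for illustration, and it draws orthogonal weights from the Haar measure via a QR decomposition, a standard construction. The function names (gaussian_init, orthogonal_init, propagate_norms) and the parameters n, depth, gain, and seed are hypothetical. The sketch tracks the norm of a signal as it passes through the layers: with unit gain the orthogonal layers preserve the norm exactly, the Gaussian layers preserve it only on average, and any gain different from one produces exponential growth or decay with depth.

```python
import numpy as np


def gaussian_init(n, rng, gain=1.0):
    """n x n matrix of i.i.d. Gaussian entries with variance gain**2 / n."""
    return rng.normal(0.0, gain / np.sqrt(n), size=(n, n))


def orthogonal_init(n, rng, gain=1.0):
    """n x n Haar-distributed orthogonal matrix, scaled by gain."""
    a = rng.normal(size=(n, n))
    q, r = np.linalg.qr(a)
    # Multiply each column of q by the sign of the matching diagonal entry
    # of r; without this correction q is orthogonal but not Haar-distributed.
    d = np.sign(np.diag(r))
    d[d == 0] = 1.0
    return gain * q * d


def propagate_norms(init, n=128, depth=50, seed=0, gain=1.0):
    """Norm of a unit-norm input after each layer of a deep linear network.

    Assumption (illustration only): every layer is a fresh n x n weight
    matrix drawn by `init`, with no bias and no activation function.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    x /= np.linalg.norm(x)
    norms = []
    for _ in range(depth):
        x = init(n, rng, gain) @ x
        norms.append(np.linalg.norm(x))
    return np.array(norms)


if __name__ == "__main__":
    for name, init in [("gaussian", gaussian_init), ("orthogonal", orthogonal_init)]:
        norms = propagate_norms(init)
        print(name, "final norm:", norms[-1])
```

Passing a gain slightly above or below one to propagate_norms shows the exponential growth or decay that tuning to criticality is meant to prevent. In the orthogonal case the norm after L layers is exactly gain**L, up to floating-point error.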