Big luma optimisations, minor pooling optimisations