To apply backpropagation, the activation function needs to be differentiable. Unfortunately, a step function is not suitable: its derivative is zero everywhere except at the jump, where it is undefined, so no useful gradient can be propagated. A workaround is to use a sigmoid activation function with a steep curve.
Assuming the sigmoid is written as f(x) = 1 / (1 + e^(-a*x)), the parameter a controls the slope: make a as large as needed for the curve to be steep enough.
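
As a rough illustration, here is a minimal Python sketch of that idea. It assumes the sigmoid has the form 1 / (1 + e^(-a*x)); the function names steep_sigmoid, steep_sigmoid_grad and step are made up for this example and do not come from any library.

```python
import math

def steep_sigmoid(x, a=1.0):
    """Sigmoid with slope parameter a (assumed form): 1 / (1 + exp(-a * x))."""
    z = a * x
    # Branch on the sign so exp() never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def steep_sigmoid_grad(x, a=1.0):
    """Derivative with respect to x: a * s * (1 - s), defined for every x."""
    s = steep_sigmoid(x, a)
    return a * s * (1.0 - s)

def step(x):
    """The step function that the sigmoid is meant to approximate."""
    return 1.0 if x >= 0 else 0.0

if __name__ == "__main__":
    points = (-1.0, -0.1, 0.1, 1.0)
    for a in (1, 10, 100):
        # Largest gap between the sigmoid and the step at the sample points.
        gap = max(abs(steep_sigmoid(x, a) - step(x)) for x in points)
        print(f"a={a}: max gap to step = {gap:.6f}, slope at 0 = {steep_sigmoid_grad(0.0, a)}")
```

The gap to the step function shrinks as a grows, while the derivative stays defined everywhere. Note that the slope at x = 0 is a/4 and the gradient is close to zero away from 0, so a very large a may slow down training in practice; picking a is a trade-off rather than "the bigger the better".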