OPTIMIZING LONGITUDINAL VELOCITY CONTROL VIA SELF-SUPERVISED LEARNING AND DEEP DETERMINISTIC POLICY GRADIENT