We present a position and orientation controller for a hybrid rigid-soft manipulator arm in which the soft arm is extruded from a two-degree-of-freedom rigid link. Our approach involves learning the dynamics of the hybrid arm operating at 4 Hz and leveraging the learned model to generate optimal trajectories that serve as expert data for learning a control policy. We extensively evaluate the policy on a physical hybrid arm capable of jointly controlling rigid and soft actuation. We show that, with a single policy, the arm is capable of reaching arbitrary poses in the workspace with a position error of 3.73 cm (<6% of the overall arm length) and an orientation error of 17.78 deg within 12.5 seconds, of operating at different control frequencies, and of controlling the end effector under different loads. Compared to previous quasi-static controllers for hybrid arms, our results show significant improvements in control speed while effectively controlling both the position and orientation of the end effector.