Introduction
============

Intro.
----------------

We propose a modular framework that addresses the reality gap in the vision domain and navigates a robot via virtual signals. The robot relies on a single monocular camera for navigation and does not assume LIDAR, a stereo camera, or odometry information from the robot.

Features
--------

- Automatically navigates the robot to a specified goal without any high-cost sensors.
- Relies on a single camera and deep learning methods.
- Uses Sim-to-Real techniques to bridge the gap between the virtual environment and the real world.
- Introduces virtual guidance to entice the agent to move in a specific direction.
- Uses reinforcement learning to avoid obstacles while driving through crowds of people.