Cloud computing is one of the most important topics in computing technology. As demand grows for a wide variety of services on cloud infrastructure, providing fast and affordable computing resources to users has attracted considerable attention. Because resources are limited and must be distributed fairly among cloud customers, managing and allocating them is challenging. This thesis addresses the problem of auto-scaling Docker containers for microservices on cloud computing infrastructure. A prototype is designed and developed to reduce the complexity of dynamically provisioning resources for incoming requests in a cost-effective manner. The auto-scaling approach used in this thesis is divided into two parts: provisioning virtual machines and placing Docker containers inside them. Cost plays an important role in provisioning the resources. Two methods, the Cosine Similarity-based placement algorithm and the Remaining Capacity-based (Volume) placement algorithm, are implemented to place the Docker containers inside the virtual machines. The approach is evaluated in three scenarios: testing the functionality and correctness of the implemented algorithms; testing the QoS of the system by running batches of requests in a real cloud environment; and simulating a large-scale experiment. The results show that the implemented algorithms are able to auto-scale resources based on incoming requests at an affordable cost.
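
To make the two placement strategies concrete, the following is a minimal illustrative sketch, not the implementation developed in this thesis. It assumes that each virtual machine and each container is described by a two-dimensional resource vector (CPU cores, memory in GB). The names `VM`, `place_by_cosine`, `place_by_volume`, and `remaining_volume` are hypothetical. The cosine variant is assumed to pick the feasible virtual machine whose remaining-capacity vector is best aligned with the container's demand vector. The volume variant is assumed to pick the feasible virtual machine with the smallest normalized remaining volume (a best-fit rule); the actual selection rule may differ.

```python
import math
from dataclasses import dataclass


@dataclass
class VM:
    """Hypothetical VM model: resources are (cpu_cores, memory_gb)."""
    name: str
    capacity: tuple
    used: tuple = (0.0, 0.0)

    def remaining(self):
        # Free resources per dimension.
        return tuple(c - u for c, u in zip(self.capacity, self.used))

    def fits(self, demand):
        # A container fits only if every dimension has enough free capacity.
        return all(d <= r for d, r in zip(demand, self.remaining()))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def place_by_cosine(vms, demand):
    """Assumed rule: choose the feasible VM whose remaining-capacity
    vector has the highest cosine similarity to the demand vector."""
    feasible = [vm for vm in vms if vm.fits(demand)]
    if not feasible:
        return None  # no VM can host the container; a new VM would be needed
    return max(feasible, key=lambda vm: cosine_similarity(vm.remaining(), demand))


def remaining_volume(vm):
    # Product of the normalized free fractions; assumes capacities are positive.
    volume = 1.0
    for c, r in zip(vm.capacity, vm.remaining()):
        volume *= r / c
    return volume


def place_by_volume(vms, demand):
    """Assumed rule: choose the feasible VM with the smallest
    normalized remaining volume (best fit)."""
    feasible = [vm for vm in vms if vm.fits(demand)]
    if not feasible:
        return None
    return min(feasible, key=remaining_volume)


# Illustrative data only; not taken from the experiments.
vms = [
    VM("vm-a", (4.0, 16.0), (1.0, 12.0)),  # remaining (3, 4)
    VM("vm-b", (4.0, 16.0), (3.0, 8.0)),   # remaining (1, 8)
    VM("vm-c", (8.0, 32.0), (0.0, 0.0)),   # remaining (8, 32)
]
demand = (1.0, 4.0)

print(place_by_cosine(vms, demand).name)  # vm-c
print(place_by_volume(vms, demand).name)  # vm-b
```

In this toy example the two rules disagree: the cosine rule prefers the empty virtual machine because its free resources have the same CPU-to-memory ratio as the request, while the volume rule packs the container into the fullest virtual machine that can still host it. When no virtual machine fits, both functions return `None`, which is the point at which a provisioning step would be expected to add capacity.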