Interest in sports analytics systems has grown in the past few years, with advances automating otherwise tedious and time-consuming tasks. However, these systems are still unable to perform semantic analysis of video footage, such as automatic activity labeling (for example, "run" and "walk") and performance analysis (for example, reaction times). In this thesis, we propose a prototype, within the Bagadus sports analytics system, that obtains action labels and athlete poses. With these, it is possible to automatically annotate (label) activities and, because the athlete's pose is known, to aid in analyzing the athletes' performance (e.g., identifying whether they use the right technique when running). Our prototype treats both action recognition and pose estimation as a content-based video retrieval problem: we first obtain athlete-centered video sequences and then compare these sequences against an annotated database. The comparison uses a video similarity measure based on the motion in the video sequences, computed using optical flow. This approach allows us to obtain additional semantic information, consisting of action labels and poses, and suits the use case of a soccer stadium, where cameras are located at a distance from the athletes. To support the proposed content-based video retrieval solution, a large set of athlete skeletons (the joint locations of an athlete within a frame) is annotated using the crowdsourcing platform Microworkers. Using crowdworkers, we obtain high-quality skeletons, comparable to expert annotations, within a short period of time; this is achieved by iterating over several designs of the user interface and applying different filtering techniques to the annotations.
The proposed method achieves good accuracy and robustness for both action recognition and pose estimation: it correctly classifies 78% of all actions in a set of selected video sequences and estimates poses with up to pixel-perfect results. This allows us to extend the capabilities of the Bagadus sports analytics system with semantic analysis of actions and poses, and the method can also be used in entertainment applications such as free-view rendering.