Charles Camp, Developer in Barcelona, Spain

Charles Camp

Verified Expert in Engineering

Machine Learning Developer

Location
Barcelona, Spain
Toptal Member Since
August 24, 2020

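Charles holds certifications in both artificial intelligence and data science. He excels at building high-performance models and making them easy to use. He adapts well to different environments, having worked in banks, startups, IT firms, and laboratories. His areas of expertise are natural language processing and time series analysis.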

Portfolio

Non-Fungible Films, Inc.
Python, Artificial Intelligence (AI), Natural Language Processing (NLP), GPT...
Global CPG Company
Python, Machine Learning, PySpark, Scikit-learn, Pandas
Phragmites, Inc.
Generative Pre-trained Transformers (GPT), GPT...

Experience

Availability

Part-time

Preferred Environment

Python, Amazon Web Services (AWS), Natural Language Processing (NLP), Time Series Analysis, Transformers, Reinforcement Learning

The most amazing...

...model I've built can identify people linked to financial crime.

Work Experience

AI Developer

2023 - 2023
Non-Fungible Films, Inc.
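  • Fine-tuned Stable Diffusion models for use with the company's virtual characters.
  • Deployed a Discord bot similar to Midjourney, but backed by custom Stable Diffusion models.
  • Integrated the models with a Stable Diffusion UI to enable drawing, image to image, and other applications.
Technologies: Python, Artificial Intelligence (AI), Natural Language Processing (NLP), GPT, Generative Pre-trained Transformers (GPT), Computer Vision, Machine Learning, Node.js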

ML Engineer

2022 - 2023
Global CPG Company
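  • Created a pipeline that automatically computes lookalike audiences from internal consumer behavior data.
  • Compared models and tuned hyperparameters to achieve the highest performance.
  • Created custom PySpark and scikit-learn estimators for integration into PySpark and scikit-learn pipelines, respectively.
Technologies: Python, Machine Learning, PySpark, Scikit-learn, Pandas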

ML and NLP Engineer

2021 - 2022
Phragmites, Inc.
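  • Set up an EC2 server, analyzed Telegram messages stored on a Postgres DB, and classified them as relevant or irrelevant to specific crypto-related projects.
  • Built a bot message detection model using a near-duplicate clustering approach.
  • Used graph theory to quantify the influence of Telegram users in crypto-focused conversations.
  • Trained an NER model to detect crypto project names in Telegram messages.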
Technologies: Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Machine Learning, Trend Analysis, Sentiment Analysis, Cryptocurrency, Artificial Intelligence (AI), Google Cloud, Python, SQL, Pandas, Scikit-learn, Data Science, Natural Language Toolkit (NLTK), SpaCy, Hugging Face, Neural Networks, Communication, Elasticsearch

Senior Data Scientist

2021 - 2022
Trust & Safety Laboratory
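  • Trained machine learning models to detect controversial topics in tweets. Controversial topics are defined as those likely to contain harmful misinformation.
  • Trained ML models to detect false claims and misinformation in tweets.
  • Built a pipeline to collect human-in-the-loop reviews (AWS), automated the labeling of potentially misleading tweets, and performed website scraping.
  • Developed a serverless framework to automate social media screening tasks.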
Technologies: Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Docker, Kubernetes, Bazel, Machine Learning, Python, Scikit-learn, Pandas, Data Science, Linux, Amazon Web Services (AWS), Neural Networks, SpaCy, Artificial Intelligence (AI), Sentiment Analysis, Test Automation, Hugging Face

Python Developer | AI

2020 - 2021
Click Factura SA de CV
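  • Transcribed and summarized Spanish audio meetings: fine-tuned speech-to-text models (DeepSpeech, NeMo, and Wav2Vec) and used text summarization and diarization models.
  • Trained an OCR model to extract information from Mexican tickets.
  • Integrated the models into an existing Django application by creating APIs.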
  • Deployed the models using Docker containers and Flask.
Technologies: Python, Machine Learning, TensorFlow, Test Automation, Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Artificial Intelligence (AI), Amazon Web Services (AWS), Kubernetes, Docker, Django, APIs, OCR, Speech to Text, SQL, Data Science, Linux, Neural Networks, Communication, Google Cloud, Hugging Face

Machine Learning Expert | Digital Advertisement

2020 - 2020
Primal Analytics
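  • Deployed a Lambda to automatically detect anomalies in Google Ads statistics.
  • Compared various state-of-the-art ML models for time series anomaly detection.
  • Set up the AWS account used for data storage and Lambda execution.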
Technologies: Machine Learning, Digital Advertising, Python, Time Series Analysis, Scikit-learn, Anomaly Detection, SQL, Pandas, Data Science, Amazon Web Services (AWS), Neural Networks, ARIMA Models, Artificial Intelligence (AI), Trend Analysis

Senior Data Scientist

2019 - 2020
Glovo
  • Designed, implemented, and deployed a customer lifetime value model. We deployed it on an EC2 instance using Luigi and scheduled it with Jenkins.
  • Used linear programming to optimize pickers' time shifts.
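  • Built an end-to-end pipeline that decides whether to show a product in the app based on the probability that it is available in the store, to improve the customer experience. The model was trained on SageMaker and then deployed on an EC2 instance.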
Technologies: Linear Programming, TensorFlow, Data Science, Pandas, Machine Learning, Scikit-learn, Python, XGBoost, Redshift, Amazon SageMaker, Amazon Web Services (AWS), Luigi, SQL, Communication, Artificial Intelligence (AI), Anomaly Detection

Data Scientist

2016 - 2019
Credit Suisse
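  • Designed and deployed machine learning models to detect money laundering using transaction data.
  • Led the negative news screening project, which automatically screens news data for links to financial crime in order to enrich risk scoring models.
  • Used NLP to measure the impact of news data on financial product sales.
  • Organized the data sourcing and mapping of various transaction and KYC data sources on the big data platform. Also handled the design and implementation of the data model for transaction and KYC data to facilitate transaction monitoring.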
Technologies: Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), TensorFlow, PySpark, Data Science, Pandas, Machine Learning, Scikit-learn, Python, SpaCy, SQL, Artificial Intelligence (AI), Communication, Natural Language Toolkit (NLTK), R, Time Series Analysis, ARIMA Models, XGBoost, Sentiment Analysis, Anomaly Detection, Hugging Face, Elasticsearch, Test Automation

Research Scholar

2016 - 2016
Carnegie Mellon University
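  • Designed and implemented a model to predict the survival of patients after cardiac arrest using their brain activity data (multivariate time series).
  • Established an evaluation metric that gives a better score to models that predict survival early.
  • Clustered patients to identify common characteristics and infer specific preventive measures to improve their survival rate.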
Technologies: ARIMA Models, Time Series Analysis, R, Data Science, Pandas, Machine Learning, Scikit-learn, Linux, Artificial Intelligence (AI), Trend Analysis, Anomaly Detection

Data Scientist Intern

2015 - 2015
Capgemini
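  • Set up a Spark cluster to read sensor data from HDFS and preprocess it.
  • Built a scalable supervised model to detect manufacturing failures using multivariate time series data (sensor data).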
  • Fine-tuned and validated the model. Identified main features leading to breakdowns.
Technologies: ARIMA Models, Time Series Analysis, PySpark, Data Science, Pandas, Machine Learning, Scikit-learn, Python, Linux, XGBoost, Artificial Intelligence (AI), Anomaly Detection

Recommender System

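We are given a matrix V of dimensions (number of users, number of movies). An entry is 0 if the user disliked the movie and 1 if the user liked it. The rest of the values are NaNs.

In the first step, we use non-negative matrix factorization (NMF) to find two matrices W and H, of sizes (number of users, K) and (K, number of movies) respectively, that minimize the difference between V and WH, where K is a small value (< 10). That means we look for W and H such that WH is close to V.

Afterward, we use W and H to cluster the users and can then recommend movies that their assigned cluster would like.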
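
The steps above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the project's actual code: the function names `masked_nmf` and `recommend`, the toy matrix, the masked multiplicative update rule (so that NaN entries are ignored), and the rule of assigning each user to the cluster of their largest latent factor in W are all hypothetical choices made for the example.

```python
import numpy as np


def masked_nmf(V, k=2, n_iter=500, seed=0, eps=1e-9):
    """Approximate V by W @ H (both non-negative), fitting observed entries only."""
    rng = np.random.default_rng(seed)
    mask = ~np.isnan(V)               # True where a like/dislike is known
    V0 = np.where(mask, V, 0.0)       # NaNs set to 0; the mask keeps them out of the fit
    n_users, n_movies = V.shape
    W = rng.random((n_users, k)) + 0.1
    H = rng.random((k, n_movies)) + 0.1
    for _ in range(n_iter):
        # Multiplicative updates restricted to observed entries (assumed update rule)
        WH = (W @ H) * mask
        W *= (V0 @ H.T) / (WH @ H.T + eps)
        WH = (W @ H) * mask
        H *= (W.T @ V0) / (W.T @ WH + eps)
    return W, H


def recommend(V, W, H, user, top_n=2):
    """Recommend unseen movies that the user's cluster is predicted to like."""
    clusters = W.argmax(axis=1)       # assumed rule: cluster = dominant latent factor
    members = clusters == clusters[user]
    cluster_scores = (W[members] @ H).mean(axis=0)  # mean predicted liking in the cluster
    unseen = np.isnan(V[user])
    ranked = np.argsort(-cluster_scores)
    return [int(m) for m in ranked if unseen[m]][:top_n]


# Toy data: 6 users x 5 movies, 1 = liked, 0 = disliked, NaN = not rated
V = np.array([
    [1, 1, np.nan, 0, np.nan],
    [1, np.nan, 1, 0, 0],
    [np.nan, 1, 1, np.nan, 0],
    [0, 0, np.nan, 1, 1],
    [0, np.nan, 0, 1, np.nan],
    [np.nan, 0, 0, np.nan, 1],
])

W, H = masked_nmf(V, k=2)
print(recommend(V, W, H, user=0))
```

Any other clustering method (for example k-means on the rows of W) could replace the argmax rule; the sketch uses argmax only to stay dependency-free.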

Face and Image Recognition

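A facial recognizer that uses a webcam and OpenCV to detect faces. The project was first extended to simple objects and then to custom objects by retraining a pre-trained state-of-the-art TensorFlow model.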

Languages

Python, SQL, R

Libraries/APIs

Scikit-learn, Pandas, PySpark, SpaCy, Natural Language Toolkit (NLTK), XGBoost, TensorFlow, Luigi, OpenCV, Node.js

Paradigms

Data Science, Anomaly Detection, Linear Programming, Test Automation

Other

Time Series Analysis, Natural Language Processing (NLP), Machine Learning, Artificial Intelligence (AI), Communication, GPT, Generative Pre-trained Transformers (GPT), Neural Networks, ARIMA Models, Sentiment Analysis, Cryptocurrency, Hugging Face, Analysis of Variance (ANOVA), APIs, Speech to Text, OCR, Decentralized Finance (DeFi), Trend Analysis, Digital Advertising, Computer Vision, Reinforcement Learning, PEFT, LoRA, Transformers

Platforms

Linux, Amazon Web Services (AWS), Docker, Kubernetes

Frameworks

Django

Tools

Amazon SageMaker, Bazel

Storage

Redshift, Google Cloud, Elasticsearch

Education

2014 - 2016

Master's Degree in Data Science

Grenoble Institute of Technology - Grenoble, France

2011 - 2014

Bachelor's Degree in Computer Science

Grenoble Institute of Technology - Grenoble, France

Certifications

JULY 2023 - PRESENT

Generative AI with Large Language Models

Coursera

APRIL 2022 - PRESENT

Decentralized Finance (DeFi)

Coursera

FEBRUARY 2022 - FEBRUARY 2025

AWS Solutions Architect Associate

Pearson VUE

DECEMBER 2020 - PRESENT

Django for Everybody

University of Michigan | via Coursera