Henrique Lobato – Data Engineering
**Tech Lead | Senior Data Engineer | ETL Specialist**
Professional Summary
Data Engineering Specialist
Focused on building scalable ETL pipelines, optimizing data ingestion processes, and delivering high-throughput batch and real-time data workflows. Over 10 years of experience designing and implementing robust data systems processing 10M+ records daily across industries. Expert in SQL-based architectures, PostgreSQL optimization, and Apache Spark/PySpark, with a focus on pipeline resiliency and data validation.
Technical Skills
Core Data Technologies
- Languages: Python (10+ years), SQL, PySpark
- Databases: PostgreSQL, MySQL, Snowflake, DynamoDB, Redis
- Big Data: Apache Spark, Hadoop, Data Lakes, Data Warehousing
- ETL/ELT: Custom Python ETL tooling, Data Pipeline Design, Batch Processing
- Processing: Streaming Data, Batch Processing, Real-time Analytics
Data Infrastructure & Optimization
- Cloud Data Services: AWS (S3, RDS, Redshift), GCP BigQuery
- Orchestration: Airflow, Jenkins, Custom Scheduling Solutions
- Containerization: Docker, Kubernetes for Data Workloads
- Storage Solutions: S3, HDFS, Optimized Data Formats (Parquet, Avro)
- Performance: Query Optimization, Indexing Strategies, Data Partitioning
Data Quality & Governance
- Validation: Schema Validation, Data Quality Checks, Reconciliation
- Monitoring: Pipeline Monitoring, Data SLAs, Alerting
- Documentation: Data Dictionaries, Pipeline Documentation, Data Lineage
- Security: Data Encryption, Access Controls, Compliance (GDPR, CCPA)
Professional Experience
OneTrust
Senior Data Engineer | ETL Architecture Lead – Remote – USA
Jun 2023 – Present
- Architected and led data migration platform from Convercent with 99.9% accuracy
- Designed robust PostgreSQL integrations for compliance data ingestion and validation
- Reduced data processing time by 45% through batch optimization and parallelization
- Implemented comprehensive data validation framework ensuring data integrity
- Created custom ETL pipelines for sensitive compliance data with full audit trails
- Engineered automated data reconciliation processes between source and target systems
IKTech
Senior Data Engineer | Pipeline Architect – Campinas, Brazil (Remote)
Nov 2020 – Present
- Designed high-throughput data pipelines processing 5TB+ daily for agricultural analytics
- Optimized PostgreSQL query performance reducing processing time by 60%
- Created real-time crop analysis data workflows using custom ETL solutions
- Implemented comprehensive data quality checks and validation frameworks
- Maintained >95% test coverage across all data processing systems
- Built data distribution systems with Redis for high-availability analytics
SecurityScorecard
Senior Data Engineer | Data Pipeline Specialist – New York, USA (Remote)
Jan 2022 – Jun 2023
- Architected scalable data ingestion pipelines for security metrics processing
- Built automated data quality monitoring systems for 10M+ daily records
- Designed security-focused data warehousing solution with strict access controls
- Developed resilient ETL workflows with comprehensive error handling and retry logic
- Implemented real-time data streaming architecture for security event monitoring
- Created PostgreSQL optimization strategies reducing storage requirements by 35%
BairesDev
Senior Data Engineer | ETL Specialist – Canada (Remote)
Dec 2020 – Jun 2021
- Developed PostgreSQL data backends with optimized SQLAlchemy layers
- Built scalable data ingestion pipelines with comprehensive validation
- Implemented data partitioning strategies improving query performance by 85%
- Created automated data quality assurance processes for sensitive financial data
- Designed data extraction services from legacy systems with 100% accuracy
Senior Data Engineer | Big Data Architect – Brazil
2019 – 2020
- Built PySpark/Hadoop ML pipelines for Serasa scoring systems processing 10M+ records daily
- Architected data lake solution for Globo.com's content analytics platform
- Designed real-time data streaming solution processing 500K+ events/hour
- Implemented data quality monitoring framework for critical financial data
- Created ETL workflows for credit scoring models with comprehensive validations
GPr Sistemas
Data Engineer | Analytics Specialist – Brazil
Jul 2019 – Nov 2019
- Developed real-time monitoring data platform for 10K+ ATM devices
- Designed time-series data storage solution with efficient querying capabilities
- Built analytics dashboard ingesting and processing network monitoring data
- Achieved <1s data processing latency for critical financial metrics
- Implemented data archiving and retention policies for compliance requirements
Sintecsys
Data Engineer | Image Processing Specialist – Brazil
Apr 2019 – Jul 2019
- Built data pipeline processing 100K+ satellite images daily for wildfire detection
- Designed metadata extraction and indexing system for rapid image retrieval
- Reduced data processing time by 75% through optimization and parallelization
- Implemented fault-tolerant storage solution for critical environmental data
- Created ETL workflows for ML model training with comprehensive data validation
Multiway (Smart City)
Data Engineer | ALPR Systems – Brazil
Jan 2016 – Feb 2019
- Engineered data processing system handling 1M+ vehicle records daily
- Designed efficient data storage architecture for license plate recognition data
- Built ETL pipelines for vehicle tracking data with geospatial indexing
- Achieved 70% improvement in data processing efficiency and storage utilization
- Implemented data retention and anonymization policies for regulatory compliance
Certifications & Education
- B.Sc. in Computer Science – University of London
- AWS Certified Cloud Practitioner
- Database Optimization & Data Pipeline Design Certifications
- Advanced PostgreSQL Administration & Performance Tuning
Languages
- 🇧🇷 Portuguese: Native
- 🇺🇸 English: Fluent
- 🇷🇺 Russian: Basic