Piotr Stanczyk
Verified email at google.com
Title
Cited by
Year
Gemini: a family of highly capable multimodal models
G Team, R Anil, S Borgeaud, Y Wu, JB Alayrac, J Yu, R Soricut, ...
arXiv preprint arXiv:2312.11805, 2023
Cited by: 1490, Year: 2023
Google research football: A novel reinforcement learning environment
K Kurach, A Raichuk, P Stańczyk, M Zając, O Bachem, L Espeholt, ...
Proceedings of the AAAI conference on artificial intelligence 34 (04), 4501-4510, 2020
Cited by: 387, Year: 2020
Acme: A research framework for distributed reinforcement learning
MW Hoffman, B Shahriari, J Aslanides, G Barth-Maron, N Momchev, ...
arXiv preprint arXiv:2006.00979, 2020
Cited by: 251, Year: 2020
What matters in on-policy reinforcement learning? a large-scale empirical study
M Andrychowicz, A Raichuk, P Stańczyk, M Orsini, S Girgin, R Marinier, ...
arXiv preprint arXiv:2006.05990, 2020
Cited by: 226, Year: 2020
What matters for on-policy deep actor-critic methods? a large-scale study
M Andrychowicz, A Raichuk, P Stańczyk, M Orsini, S Girgin, R Marinier, ...
International conference on learning representations, 2021
Cited by: 177, Year: 2021
Seed rl: Scalable and efficient deep-rl with accelerated central inference
L Espeholt, R Marinier, P Stanczyk, K Wang, M Michalski
arXiv preprint arXiv:1910.06591, 2019
Cited by: 146, Year: 2019
Factually consistent summarization via reinforcement learning with textual entailment feedback
P Roit, J Ferret, L Shani, R Aharoni, G Cideron, R Dadashi, M Geist, ...
arXiv preprint arXiv:2306.00186, 2023
Cited by: 55, Year: 2023
Gkd: Generalized knowledge distillation for auto-regressive sequence models
R Agarwal, N Vieillard, P Stanczyk, S Ramos, M Geist, O Bachem
arXiv preprint arXiv:2306.13649, 2023
Cited by: 46, Year: 2023
Gemma 2: Improving open language models at a practical size
G Team, M Riviere, S Pathak, PG Sessa, C Hardin, S Bhupatiraju, ...
arXiv preprint arXiv:2408.00118, 2024
Cited by: 40, Year: 2024
What matters in on-policy reinforcement learning? A large-scale empirical study
M Andrychowicz, A Raichuk, P Stanczyk, M Orsini, S Girgin, R Marinier, ...
CoRR abs/2006.05990, 2020
Cited by: 31, Year: 2020
On-policy distillation of language models: Learning from self-generated mistakes
R Agarwal, N Vieillard, Y Zhou, P Stanczyk, SR Garea, M Geist, ...
The Twelfth International Conference on Learning Representations, 2024
Cited by: 26, Year: 2024
Launchpad: A programming model for distributed machine learning research
F Yang, G Barth-Maron, P Stańczyk, M Hoffman, S Liu, M Kroiss, A Pope, ...
arXiv preprint arXiv:2106.04516, 2021
Cited by: 21, Year: 2021
Perfect Matching for Biconnected Cubic Graphs in O(n log² n) Time
K Diks, P Stanczyk
SOFSEM 2010: Theory and Practice of Computer Science: 36th Conference on …, 2010
Cited by: 16, Year: 2010
Rlds: an ecosystem to generate, share and use datasets in reinforcement learning
S Ramos, S Girgin, L Hussenot, D Vincent, H Yakubovich, D Toyama, ...
arXiv preprint arXiv:2111.02767, 2021
Cited by: 14, Year: 2021
Generalized knowledge distillation for auto-regressive language models
R Agarwal, N Vieillard, Y Zhou, P Stanczyk, S Ramos, M Geist, O Bachem
The Twelfth International Conference on Learning Representations, 2024
Cited by: 7, Year: 2024
Bond: Aligning llms with best-of-n distillation
PG Sessa, R Dadashi, L Hussenot, J Ferret, N Vieillard, A Ramé, ...
arXiv preprint arXiv:2407.14622, 2024
Cited by: 5, Year: 2024
Google research football: A novel reinforcement learning environment
K Kurach, A Raichuk, P Stanczyk, M Zajac, O Bachem, L Espeholt, C Riquelme, D Vincent, M Michalski, O Bousquet, et al.
arXiv preprint arXiv:1907.11180 (10), 2019
Cited by: 5, Year: 2019
Google research football
K Kurach, A Raichuk, P Stanczyk, M Zajac, O Bachem, L Espeholt, ...
"A Novel Reinforcement Learning Environment", CoRR, 2019
Cited by: 5, Year: 2019
What matters in on-policy reinforcement learning? a large-scale empirical study (2020)
M Andrychowicz, A Raichuk, P Stanczyk, M Orsini, S Girgin, R Marinier, ...
arXiv preprint arXiv:2006.05990, 2020
Cited by: 3, Year: 2020
SIO .NET Plug&Play Contest System
M Michalski, M Kosieradzki, W Rygielski, P Stańczyk, K Ciebiera, K Diks
Perspectives on Computer Science Competitions for (High School) Students, 2005
Cited by: 3, Year: 2005