Virtual Environments

  • Virtual-Taobao

    Polixir's technology originated from building the Virtual-Taobao environment. Virtual-Taobao is the first simulator in the world to successfully virtualize a large-scale real-world scenario. In Virtual-Taobao, the agent is a recommender system that interacts with virtual buyers in order to learn the best recommendation strategy.

Publications

  • On Value Discrepancy of Imitation Learning

    Imitation learning trains a policy from expert demonstrations. This paper analyzes several imitation learning approaches in depth and shows how their different underlying ideas lead to different compounding errors.

  • Improving Fictitious Play Reinforcement Learning with Expanding Models

    Fictitious play is an effective framework for reinforcement learning in zero-sum games. When deep neural networks are used as the policy models, training suffers from two issues: old data is easily forgotten, and models are hard to mix. This paper presents expanding models to address both issues.

  • Novelty-Prepared Few-Shot Classification

    Few-shot classification aims for high accuracy from only a few samples, which is crucial for real-world applications. Our new approach is the first in which the learning model is open-world aware. Consequently, our model adapts better to new classification tasks and achieves a significant improvement over state-of-the-art methods.

2019-2020©Polixir Technology Co.,Ltd.