Porcelain Publishing / IVC / Volume 2 / Issue 1 / DOI: 10.47297/ppiivc2026020107
ARTICLE

Spatial Artificial Intelligence in Video Generators and Beyond

Yuxiang Dong1
1 Department of Art and Art History, University of Miami, Coral Gables, FL 33146, USA
Published: 12 March 2026
© 2026 by the Author(s). This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/)
Abstract

In recent years, the development of video generators has garnered significant attention from both the public and academia because of its potential to bring paradigmatic changes to the video and film production industry. Current AI video generators are still subject to many front-end technical limitations, especially in their capacity to understand complicated spatial relationships between characters, objects, and environments in the generated videos, which causes inconsistency and uncontrollability in moving images. However, recent developments in spatial artificial intelligence and world models may provide a solution by building virtual studios. These rapidly developing fields may lead to a wide range of achievements beyond video and film production, extending to autonomous systems, the metaverse, and even Artificial General Intelligence. At the same time, these technologies may intensify problems of misinformation and disinformation, which demands collaboration between human intelligence and artificial intelligence for transformative change.

Keywords
Spatial Artificial Intelligence
Spatial AI
Spatial Intelligence
AI-Generated Video
World Model

Intelligent Visuals and Communication, Electronic ISSN: 2978-5499 Print ISSN: 2978-5480, Published by Porcelain Publishing