Is there any sector that Artificial Intelligence (AI) can’t transform?
If you’re following the headlines (and search trends), the answer may seem to be a hard ‘no.’ That’s because businesses across sectors have recently been making sweeping pronouncements tying their products and services to AI and machine learning (ML) principles.
This includes some of the largest legacy tax firms in the industry, which aren’t necessarily known for their technological prowess. The latest behemoth to follow suit is PricewaterhouseCoopers LLP (PwC), which is committing $1 billion toward adopting generative AI technology with the goal of automating its tax and auditing services.
As part of the initiative, PwC plans to work closely with Microsoft Corp. and OpenAI, the company behind ChatGPT and other high-profile generative AI tools, over the course of the next three years. The investments include developing new services powered by generative AI systems (i.e., natural-language response tools), while also hiring and developing more talent specializing in the technology to offer AI consulting.
Even so, PwC is late to the fray in exploring generative AI, let alone in offering a solutions suite that can benefit customers today. And even as legacy tax consultants wholeheartedly embrace generative AI, cracks are already showing in the reliable facade of some of the most popular new tools.
ChatGPT scores low on tax exams
Researchers at Brigham Young University announced this week that they literally put ChatGPT to the test. Along with 186 other accredited universities, BYU compiled 20,000 accounting exam questions of varying difficulty related to auditing, tax code, and information systems (among other topics), and prompted responses from ChatGPT as well as human practitioners to see how they matched up.
Students (that is, human tax pros) scored an average of 76.7 percent on the exam, while ChatGPT scored only 47.4 percent.
Most concerning for tax companies is that ChatGPT fared especially poorly on the tax, financial, and managerial assessments, suggesting that AI bots lack the critical thinking skills to carry out the relevant mathematical processes.
Along the same lines, ChatGPT was also a poor student when it came to short-answer questions, answering only 28 to 29 percent correctly.
“In general, higher-order questions were harder for ChatGPT to answer. In fact, sometimes ChatGPT would provide authoritative written descriptions for incorrect answers, or answer the same question in different ways,” the research said.
Considering that many end-of-year (EOY) tax programs require accurate and authoritative reporting, ChatGPT’s tendency to present inaccuracies with confidence is especially concerning.
Human expertise still trumps chatbots
The big takeaway from all of this is that while automation is a worthy goal for virtually any business, leaving humans completely out of the equation—especially with a process as important as tax preparation—is never the right call.
Additionally, while PwC is just starting to use generative AI to expand and refine its services, the team at Boast AI has long used AI-driven integrations on our platform to help organize and synchronize relevant customer data. Ultimately, this helps ensure our human tax pros can work smarter and deliver the best results possible.
So while generative AI still holds significant promise, it’s unlikely to deliver the expertise businesses need to accurately file their taxes—let alone defend against potential audits.
Boast AI takes a white-glove approach to our partnership with innovative founders to help ensure they reap all the federal funding and credits they qualify for without exhausting their resources. Request a demo with the team today to get started.